
IN THE UNITED STATES PATENT AND TRADEMARK OFFICE 

DECLARATION OF JOHN C. ROCKETT, Ph.D. 
UNDER 37 C.F.R. § 1.132 

I, JOHN COUGHLIN ROCKETT III, Ph.D., declare and 
state as follows: 

1. Since 1995 I have been engaged full-time in 
molecular toxicology research, with an emphasis on the 
application of expression profiling techniques, including but 
not limited to nucleic acid microarray expression profiling 
techniques, to studies of the mechanisms of toxicant action 
and to the design of assays to monitor toxicant exposure. 

2. My curriculum vitae, including my list of 
publications, is attached hereto as Exhibit A. 

3. For the past 5 years, my work has focused 
primarily on analyzing the effects of potentially hazardous 
environmental agents, such as heat, water disinfectant 
byproducts, and conazole fungicides, on the male reproductive 
tract. Although we are interested in the basic mechanisms of 
action of such toxicants, we also have two practical goals in 
mind: first, to identify individual agents and families of 
agents that adversely affect male reproductive development and 
function, and second, to develop methods for monitoring human 
exposure to such agents, particularly methods capable of 
identifying toxicant exposure at an early stage. 

4. I have relied on expression profiling as a 
principal approach to these goals. Expression profiling, by 
reporting the expression levels of thousands of genes 
simultaneously, gives us an opportunity to identify and group 
toxicants based on similarities in the patterns of gene 
expression they induce in cells and tissues; the gene 
expression profiles induced by treatment with known testicular 
toxins serve as standards, molecular signatures or molecular 
fingerprints as it were, against which the patterns of gene 
expression induced by agents of unknown toxicity may be 
compared and judged. In addition, gene expression profiling 
may give us the opportunity to detect toxicity before more 
gross phenotypic changes become manifest. 

5. In keeping with this research emphasis, I have 
until recently: 

served on the Microarray Technical 
Subcommittee of the United States Environmental 
Protection Agency (EPA) Genomics Task Force, and 

served on the Scientific Committee for 
the conference series on "Critical Assessment of 
Techniques for Microarray Data Analysis," held 
annually at Duke University, Durham, NC; 

and I currently 

serve on the Technical Committee on the 
Application of Genomics to Mechanism-Based Risk 
Assessment of the International Life Sciences 
Institute's Health and Environmental Sciences 
Institute, 

serve on the Genomics and Proteomics 
Committee of the National Health and Environmental 
Effects Research Laboratory of the EPA's Office of 
Research and Development, 

belong to the [North Carolina Research] 
Triangle Array Users Group, 



belong to the Molecular Biology 
Speciality Section of the Society of Toxicology, 
and 

belong to the Triangle Consortium for 
Reproductive Biology. 

In addition, I am the principal investigator on a cooperative 
research and development agreement (CRADA) entitled 
"Development of a Genetic Test for Male Factor Infertility." 
Prior to this, I was a co-principal investigator on a 
materials cooperative research and development agreement 
(MCRADA) to print oligonucleotide-based microarrays; and from 
1999 to 2002, I was a co-investigator on a CRADA to develop gene 
microarrays for toxicology applications. 

6. I presume the reader's familiarity with the 
basic construction and operation of microarrays. For purposes 
of the discussion to follow, I use the phrase "nucleic acid 
microarray" and, equivalently, the term "microarray" to refer 
generically to the various types of nucleic acid microarray 
that include immobilized nucleic acid probes of sufficient 
length to permit specific binding, with minimal cross- 
hybridization, to the probe's cognate transcript, whether the 
transcript is in the form of RNA or DNA. Although this 
definition excludes microarrays having shorter probes, such as 
the 20-mer probes of arrays manufactured by Affymetrix, Inc., 
many of the comments that follow nonetheless apply to such 
microarrays as well. 

7. Although my own work with microarrays dates 
back only to 1998, and high density spotted nucleic acid 
microarrays themselves date back perhaps only to 1995,¹ 
microarrays are by no means the only, nor the first, 
expression profiling tool. As I describe in detail in my 
Xenobiotica review,² there are a number of other differential 
expression analysis technologies that precede the development 
of microarrays, some by decades, and that have been applied to 
drug metabolism and toxicology research, including: 
(1) differential screening; (2) subtractive hybridization, 
including variants such as chemical cross-linking subtraction, 
suppression-PCR subtractive hybridization and representational 
difference analysis; (3) differential display; (4) restriction 
endonuclease facilitated analyses, including serial analysis 
of gene expression (SAGE) and gene expression fingerprinting; 
and (5) EST analysis. 

8. In my own earlier research, I used both 
reverse-transcriptase polymerase chain reaction (RT-PCR) and 
suppression-PCR subtractive hybridization (SSH) to study 
patterns of differential gene expression caused by hepatic 
challenge with nongenotoxic and genotoxic hepatotoxins.³ 



¹ Schena et al., "Quantitative monitoring of gene expression patterns 
with a complementary DNA microarray," Science 270:467-470 (1995), attached 
hereto as Exhibit B. 

² Rockett et al., "Differential gene expression in drug metabolism and 
toxicology: practicalities, problems and potential," Xenobiotica 29:655-691 
(1999) (hereinafter, "Xenobiotica review"), attached hereto as Exhibit C. 

³ See, e.g., Rockett et al., "Molecular profiling of non-genotoxic 
carcinogenesis using differential display reverse transcription polymerase 
chain reaction (ddRT-PCR)," European J. Drug Metabolism & Pharmacokinetics 
22(4):329-33 (1997), and Rockett et al., "Use of a suppression-PCR 
subtractive hybridization method to identify gene species which demonstrate 
altered expression in male rat and guinea pig livers following 3-day 
exposure to [4-chloro-6-(2,3-xylidino)-2-pyrimidinylthio]acetic acid," 
Toxicology 144(1-3):13-29 (2000), attached hereto respectively as Exhibits 
D and E. 



9. These older transcript expression profiling 
techniques provide analogous expression data, but with far 
lower throughput. 

10. It has been well-established, at least since 
the introduction of high density spotted microarrays in 1995, 
that: 

(i) each probe on the microarray, with 
careful design and sufficient length, and with 
sufficiently stringent hybridization and wash 
conditions, binds specifically and with minimal 
cross-hybridization, to the probe's cognate 
transcript; 

(ii) each additional probe makes an 
additional transcript newly detectable by the 
microarray, increasing the detection range, and 
thus versatility, of this analytical device for 
gene expression profiling;⁴ 

(iii) it is not necessary that the 
biological function be known in order for the gene, 
or a fragment of the gene, to prove useful as a 
probe on a microarray to be used for expression 
analysis; 

(iv) failure of a probe to detect changes 
in expression of its cognate gene does not diminish 
the usefulness of the probe on the microarray; and 

(v) failure of a probe to detect a 
particular transcript in any single experiment does 
not deprive the probe of usefulness to the 
community of users who would use this research 
tool. 

These principles also apply to transcript expression profiling 
techniques that antedate the development of high density 
spotted microarrays, and accordingly were well-understood 
prior to 1995. 

⁴ The compelling logic of this proposition has likely motivated the 
remarkably rapid progress from the earliest high density spotted arrays in 
1995 (Schena et al., "Quantitative monitoring of gene expression patterns 
with a complementary DNA microarray," Science 270:467-470 (1995), attached 
hereto as Exhibit B), to the first whole genome arrays in 1997 (Lashkari et 
al., "Yeast microarrays for genome wide parallel genetic and gene 
expression analysis," Proc. Natl. Acad. Sci. USA 94(24):13057-62 (1997) and 
DeRisi et al., "Exploring the metabolic and genetic control of gene 
expression on a genomic scale," Science 278(5338):680-6 (1997), attached 
hereto as Exhibits F and G, respectively), to the concurrent announcement 
by two companies earlier this month of their respective commercial 
introductions of single chip human whole genome arrays (Pollack, "Human 
Genome Placed on Chip; Biotech Rivals Put it Up for Sale," The New York 
Times, Thursday, October 2, 2003 (Business Day), attached hereto as 
Exhibit H; "Agilent Technologies ships whole human genome on single 
microarray to gene expression customers for evaluation," Press Release, 
Agilent Technologies, October 2, 2003, attached hereto as Exhibit I; 
"Affymetrix Announces Commercial Launch of Single Array for Human Genome 
Expression Analysis; More Than 1 Million Probes Analyze Expression Levels 
of Nearly 50,000 RNA Transcripts and Variants on a Single Array the Size of 
a Thumbnail," Press Release, Affymetrix, October 2, 2003, attached hereto 
as Exhibit J). 

11. Moreover, expression profiling is not limited 
to the measurement of mRNA transcript levels. It is widely 
understood among molecular and cellular biologists that 
protein expression levels provide complementary profiles for 
any given cell and cellular state. Although I cannot claim 
credit for having coined the phrase, I have written that the 
difference between transcript expression profiling and protein 
expression profiling is that "transcriptomics indicates what 
should happen, and proteomics shows what is happening."⁵ 

⁵ Rockett, "Macroresults through Microarrays," Drug Discovery Today 
7:804-805 (2002) (emphasis added), attached hereto as Exhibit K. 

12. For decades, such protein expression profiles 
have been generated using two dimensional polyacrylamide gel 
electrophoresis (2D-PAGE), and used, among other things, to 
study drug effects.⁶ 

13. Although the protein expression profiles 
produced by 2D-PAGE analysis are analogous to the transcript 
expression profiles provided by nucleic acid microarrays, an 
even closer analogy is perhaps offered by antibody 
microarrays; as I note in my Drug Discovery Today commentary, 
such antibody microarrays date back to the work of Roger Ekins 
in the mid- to late-1980s.⁷ 

14. The principles in paragraph 10 also apply to 
protein expression profiling analyses, particularly to 
analyses performed using antibody microarrays. Thus, as with 
nucleic acid microarrays, the greater the number of proteins 
detectable, the greater the power of the technique; the 
absence or failure of a protein to change in expression levels 
does not diminish the usefulness of the method; and prior 
knowledge of the biological function of the protein is not 
required. As applied to protein expression profiling, these 
principles have been well understood since at least as early 
as the 1980s. 

⁶ See, e.g., Anderson et al., "A two-dimensional gel database of rat 
liver proteins useful in gene regulation and drug effects studies," 
Electrophoresis 12:907-930 (1991), attached hereto as Exhibit L. 

⁷ See Ekins et al., J. Bioluminescence Chemiluminescence 5:59-78 
(1989); Ekins et al., Clin. Chem. 37:1955-1965 (1991); and Ekins, U.S. 
Patent Nos. 5,432,099, 5,807,755, and 5,837,551, attached hereto 
respectively as Exhibits M to Q. 

15. Both gene and protein expression profiling are 
particularly useful to the toxicologist, especially in the 
pharmaceutical industry. Accordingly, I made the following 
statements in my Xenobiotica review, written in the summer of 
1998: 



[I]n the field of chemical-induced 
toxicity, it is now becoming increasingly obvious 
that most adverse reactions to drugs and chemicals 
are the result of multiple gene regulation, some of 
which are causal and some of which are casually- 
related to the toxicological phenomenon per se. 
This observation has led to an upsurge in interest 
in gene-profiling technologies which differentiate 
between the control and toxin-treated gene pools in 
target tissues and is, therefore, of value in 
rationalizing the molecular mechanisms of 
xenobiotic-induced toxicity. 

Knowledge of toxin-dependent gene 
regulation in target tissues is not solely an 
academic pursuit as much interest has been 
generated in the pharmaceutical industry to harness 
this technology in the early identification of 
toxic drug candidates, thereby shortening the 
developmental process and contributing 
substantially to the safety assessment of new 
drugs. 

For example, if the gene profile in 
response to say a testicular toxin that has been 
well-characterized in vivo could be determined in 
the testis, then this profile would be 
representative of all new drug candidates which act 
via this specific molecular mechanism of toxicity, 
thereby providing a useful and coherent approach to 
the early detection of such toxicants. 

Whereas it would be informative to know 
the identity and functionality of all genes up/down 
regulated by such toxicants, this would appear a 
longer term goal, as the majority of human genes 
have not yet been sequenced, far less their 
functionality determined. However, the current use 
of gene profiling yields a pattern of gene changes 
for a xenobiotic of unknown toxicity which may be 
matched to that of well-characterized toxins, thus 
alerting the toxicologist to possible in vivo 
similarities between the unknown and the 
standard. . . . 



Despite the development of multiple 
technological advances which have recently brought 
the field of gene expression profiling to the 
forefront of molecular analysis, recognition of the 
importance of differential gene expression and 
characterization of differentially expressed genes 
has existed for many years. 



16. As noted in the preceding excerpt from my 
Xenobiotica review, expression profiling in toxicology studies 
yields patterns of changes that are characteristic of an agent 
of unknown toxicity, which patterns may usefully be matched to 
those of well-characterized toxins. 

17. In the context of such patterns of gene 
expression, each additional gene-specific probe provides an 
additional signal that could not otherwise have been detected, 
giving a more comprehensive, robust, higher resolution -- and 
thus more useful -- pattern than otherwise would have been 
possible.⁸ 

⁸ In a sense, each gene-specific probe used in such an analysis is 
analogous to a different one of the many parts of an engine, with each 
individual part, or subcombinations of such parts, deriving at least part 
of their usefulness from the utility of the completed combination, the 
functioning engine. 

18. It is my opinion, therefore, based on the state 
of the art in toxicology at least since the mid-1990s -- and 
as regards protein profiling, even earlier -- that disclosure 
of the sequence of a new gene or protein, with or without 
knowledge of its biological function, would have been 
sufficient information for a toxicologist to use the gene 
and/or protein in expression profiling studies in toxicology. 

19. The statements made in this declaration 
represent my individual views and are not intended to 
represent the opinion of my employer, the United States 
Environmental Protection Agency, or of any other branch of the 
federal government. Other than my current engagement to 
provide this declaration, I have neither had, nor currently 
have, financial ties to, or financial interest in, Incyte 
Corporation. I am not myself an inventor on any patent 
application claiming a gene or gene fragment. 

20. I declare further that all statements made 
herein of my own knowledge are true and that all statements 
made on information and belief are believed to be true, and 
further that these statements were made with the knowledge 
that willful false statements and the like so made are 
punishable by fine or imprisonment, or both, under 
Section 1001 of Title 18 of the United States Code and may 
jeopardize the validity of any patent application in which 
this declaration is filed or any patent that issues thereon. 

John Coughlin Rockett III, Ph.D. Date 

with Response dated 03/18/04 

CURRICULUM VITAE 

PERSONAL DETAILS 



Name:            John Coughlin Rockett III
Nationality:     USA
Work Address:    United States Environmental Protection Agency
                 National Health and Environmental Effects Research Laboratory
                 Reproductive Toxicology Division (MD-72)
                 Gamete and Early Embryo Biology Branch
                 Research Triangle Park
                 NC 27711
                 USA

Work Telephone:  +001 (919) 541 2678
Work Fax:        +001 (919) 541 4017
Work E-mail:     rockett.john@epa.gov



Employment and Higher Education 

CURRENT POSITION (12/00-present) 
Research Biologist 

Gamete and Early Embryo Biology Branch (MD-72) 
Reproductive Toxicology Division 

National Health and Environmental Effects Research Laboratory 

US Environmental Protection Agency 

Research Triangle Park 

NC 27711 

USA 



PREVIOUS POSITIONS 

8/98-12/00: NHEERL Post-Doctoral Research Fellow, Gamete and Early Embryo Biology 
Branch, Reproductive Toxicology Division, National Health and Environmental Effects 
Research Laboratory, United States Environmental Protection Agency, Research Triangle Park, 
NC, USA. 

Supervisors: Dr Sally P. Darney (Scientific publications under Sally D. Perreault) and Dr David 
J. Dix. 

5/95-7/98: Rhone-Poulenc Post-Doctoral Research Fellow, Molecular Toxicology Group, School 
of Biological Sciences, University of Surrey, Guildford, Surrey, England. 
Supervisor: Prof. G. Gordon Gibson. 



EDUCATION 

Ph.D., 1995 - University of Warwick, Coventry, W. Midlands, England 

Title: Transforming Growth Factor-β and Immune Recognition Molecules in Oesophageal 
Cancer. 

Supervisors: Dr Alan G. Morris (University of Warwick) and Dr S. Jane Darnton (Birmingham 
Heartlands Hospital) 

B.Sc. (Hons.), 1991 - University of Warwick, Coventry, W. Midlands, England. 

Degree: Microbiology and Microbial Technology (with intercalated year in industry), Class 2i. 

Tutor: Professor Howard Dalton. 



PROFESSIONAL ACTIVITIES 



Membership of Professional Societies: 

Society of Toxicology (Inc. Molecular Biology Speciality Section) (2001-present) 
Science Advisory Board (2001-present) 

North Carolina Chapter of the Society of Toxicology (1999-present) 

Triangle Consortium for Reproductive Biology (1999-present) 

Triangle Array Users Group (1999-present) 

Institute of Biology (U.K.) (1989-present) 

British Toxicology Society (1996-2000) 

Biochemical Society (U.K.) (1992-1995) 

British Society for Immunology (1992-1995) 

Membership of Scientific Committees: 

International Life Sciences Institute's (ILSI) Health and Environmental Sciences Institute (HESI) 
Technical Committee on the Application of Genomics to Mechanism-Based Risk Assessment: 

• Steering Committee (5/02-present). 

• Hepatotoxicity Working Group Vice-Chair (5/02-present). 

• Hepatotoxicity Work Group Member (5/01-present). 

Charter member, Fertility and Early Pregnancy Work Group of the National Children's Study 
(07/01-present). 

National Health and Environmental Effects Research Laboratory Distinguished Lecture Series 
Committee (July 03-present). 

U.S. Environmental Protection Agency Genomics Task Force Microarray Technical 
Subcommittee (August 03-present). 

National Health and Environmental Effects Research Laboratory Genomics and Proteomics 
Committee (NGPC) (September 03-present). 



Professional Meetings: 

Invited participant ("Observer") in Expert Panel Workshop: "The Role of Environmental Factors 
on the Onset and Progression of Puberty in Children". Organised by Serono Symposia 
International. November 6th-8th, 2003, Chicago, IL, USA. 

Joint organiser and co-chair of: "Genomic analysis of surrogate tissues for measuring toxic 
exposures and drug action", the "Innovations in Applied Toxicology" Symposium for the Society 
of Toxicology 42nd Annual Meeting, March 9th-13th, 2003, Salt Lake City, UT, USA. 

(8) John C. Rockett, David J. Esdaile and G Gordon Gibson (1999). Differential gene expression 
in drug metabolism and toxicology: practicalities, problems and potential. Xenobiotica, 29(7):655-691. 
(7) MC Murphy, CN Brookes, JC Rockett, C Chapman, JA Lovegrove, BJ Gould, JW Wright and 
CM Williams (1999). The quantitation of lipoprotein lipase mRNA in biopsies of human adipose 
tissue, using the polymerase chain reaction, and the effect of increased consumption of n-3 
polyunsaturated fatty acids. European Journal of Clinical Nutrition, 53:441-447. 

(6) JC Rockett, DJ Esdaile and GG Gibson (1997). Molecular profiling of non-genotoxic 
carcinogenesis using differential display reverse transcription polymerase chain reaction (ddRT- 
PCR). European Journal of Drug Metabolism & Pharmacokinetics 22(4):329-33. 

(5) Rockett, J., Larkin, K., Darnton, S., Morris, A. and Matthews, H. (1997). Five newly 
established oesophageal carcinoma cell lines: phenotypic and immunological characterisation. 
British Journal of Cancer 75(2):258-263. 

(4) J C Rockett, S J Darnton, J Crocker, H R Matthews and A G Morris (1996). Lymphocyte 
infiltration in oesophageal carcinoma: lack of correlation with MHC antigens, ICAM-1, and tumour 
stage and grade. Journal of Clinical Pathology 49:264-267. 

(3) J C Rockett, S J Darnton, J Crocker, H R Matthews and A G Morris (1995). Expression of 
HLA-ABC and HLA-DR histocompatibility antigens and intercellular adhesion molecule-1 in 
oesophageal carcinoma. Journal of Clinical Pathology 48:539-44. 

(2) Salam M, Rockett J and Morris A (1995). The prevalence of different human papillomavirus 
types and p53 mutations in laryngeal carcinomas: is there a reciprocal relationship? European 
Journal of Surgical Oncology 21:290-296. 

(1) Salam M, Rockett J and Morris A (1995). General primer-mediated polymerase chain reaction 
for simultaneous detection and typing of HPV in laryngeal carcinomas. Clinical Otolaryngology 
20:84-88. 



(2) Articles Submitted To A Scientific Journal 

(4) John C Rockett, Judith E. Schmid, Christopher J. Luft, J. Brian Garges, M. Stacey Ricci, 
Pasquale Patrizio, Norman B. Hecht and David J. Dix. Gene Expression Patterns Associated with 
Infertility in Rodent and Human Models. * An invited submission* 

(3) Roger Ulrich, John C Rockett, G. Gordon Gibson and Syril Pettit. Evaluating the Effects of 
Methapyrilene and Clofibrate on Hepatic Gene Expression: A Collaboration Between Laboratories 
and a Comparison of Platform and Analytical Approaches. 

(2) Valerie A Baker, Helen M Harries, Jeffrey F Waring, Roger Jolly, Angus de Souza, Judith E 
Schmid, Hong Ni, Roger Brown, Roger G Ulrich and John C Rockett. Clofibrate-Induced Gene 
Expression Changes in Rat Liver: A Cross-Laboratory Analysis Using Membrane cDNA Arrays. 

(1) David Miller, Corrado Spadafora, David Dix, Adrian Platts, John C. Rockett, Stephen A. 
Krawetz. Nuclease digestion of sperm chromatin suggests a random distribution of gene sequences. 

(3) Articles In Preparation For Submission To A Scientific Journal 

(3) Spearow J, DB Tully, JC Rockett and DJ Dix. Differential testicular gene expression in mouse 
strains sensitive and resistant to endocrine disruption by estrogen. 

(2) Sally D. Perreault, John C. Rockett, Laura Fenster, James Kesner, Wendy Robbins and Steven 
Schrader. Biomarkers for Assessing Reproductive Development and Health: Part 2 - Adult 
Reproductive Health. 

(1) J. Christopher Luft, Douglas B. Tully, John C. Rockett, Judith E. Schmid and 
David J. Dix. Reproductive and genomic effects in testes from mice exposed to the water 
disinfectant byproduct bromochloroacetic acid 



(4) Book Chapters 

(4) John C. Rockett. Gene Microarrays Applied to Reproductive Toxicology. In Cunningham 
(Ed): Genetic and Proteomic Applications in Toxicity Testing, Humana Press, Totowa. In 
Preparation. * An invited submission * 

(3) John C. Rockett and David J Dix. Gene Expression Networks. In Cooper (ed-in-chief): 
Encyclopaedia of the Human Genome, Nature Publishing Group. London, New York. ISBN 0-333- 
80386-8 (2003). * An invited submission* 

(2) John C. Rockett. The Future of Toxicogenomics. In Michael Burczynski (ed): "An 
Introduction to Toxicogenomics". CRC Press. Boca Raton, London, New York, Washington D.C., 
pp299-317 (2003). * An invited submission* 

(1) J. Rockett, S. Darnton, J. Crocker, H. Matthews and A. Morris: Major Histocompatibility 
Complex (MHC) class I and II and Intercellular Adhesion Molecule (ICAM)-1 expression in 
oesophageal carcinoma. Peracchia A, Rosati R, Bonavina L, Bona S, Chella B (eds): Recent 
Advances in Diseases of the Esophagus. Bologna: Monduzzi Editore, pp45-49 (1996). 



(5) Other Scientific Publications (Letters to Editors; Meeting Reports; Commentaries 
etc.) 

(11) John C. Rockett (2003). Probing the nature of microarray-based oligonucleotides. Drug 
Discovery Today 8(9):389. (A Letter To The Editor) * An invited submission* 

(10) John C. Rockett (2003). To confirm or not to confirm (microarray data) - that is the question. 
Drug Discovery Today 8(8):343. (A Letter To The Editor) 

(9B) Nazzareno Ballatori, James L. Boyer, and John C. Rockett. (2003). Exploiting Genome Data 
to Understand the Function, Regulation and Evolutionary Origins of Toxicologically Relevant 
Genes. Environ Health Perspect. 111(6):871-5. (A Meeting Report) 

(9A) Nazzareno Ballatori, James L. Boyer, and John C. Rockett. (2003). Exploiting Genome Data 
to Understand the Function, Regulation and Evolutionary Origins of Toxicologically Relevant 
Genes. EHP Toxicogenomics. 111(1T):61-5. (A Meeting Report) 

(8) John C. Rockett (2002). Surrogate Tissue Analysis for Monitoring the Degree and Impact of 
Exposures in Agricultural Workers. AgBiotechNet, 4:1-7 November, ABN 100. (A Review Article) 
* An invited submission* 

(7) John C Rockett (2002). Macroresults Through Microarrays. Drug Discovery Today, 7(15):804-805. 
(A Meeting Report) 

(6) John C. Rockett (2002). Chip, chip, array! Three chips for post-genomic research. Drug 
Discovery Today, 7(8):458-459. (A Meeting Report) 

(5) John C Rockett (2002). Use of Genomic Data in Risk Assessment. GenomeBiology, 
3(4):reports4011.1-4011.3 (http://genomebiology.com/2002/3/4/reports/4011/?isguard=1). (A Meeting 
Report) 

(4) John C. Rockett (2001). Genomic and Proteomic Techniques Applied to Reproductive 
Biology. GenomeBiology 2(9):4020.1-4020.3 (http://genomebiology.com/2001/2/9/reports/4020/). 
(A Meeting Report) 

(3) John C. Rockett (2001). Chipping away at the mystery of drug responses. The 
Pharmacogenomics Journal, 1(3):161-163. (A commentary) * An invited submission* 

(2) Rockett, John C. and Dix, David J. (1999). U.S. EPA workshop: Application of DNA arrays to 
Toxicology. Environmental Health Perspectives, 107(8):681-685. (A Meeting Report) 

(1) John C. Rockett III (1995). Immune recognition molecules and transforming growth factor 
beta-1 in oesophageal cancer. Ph.D. thesis, University of Warwick, Coventry, England. (Ph.D. 
thesis) 



(6) Published Book, Paper and Website reviews 

(9) John C. Rockett (2002). A report on the manuscript: Systemic RNAi in C. elegans requires the 
putative transmembrane protein SID-1. Winston WM, Molodowitch C, Hunter CP. Science. 2002 
295:2456-2459. GenomeBiology, 3(7):reports0034. 
(http://genomebiology.com/2002/3/7/reports/0034/). 

(8) John C Rockett (2001). A report on the manuscript: Genetic rescue of an endangered mammal 
by cross-species nuclear transfer using post-mortem somatic cells. P Loi et al., Nat Biotechnol. 
2001, 19:962-964. GenomeBiology, 3(1):reports0006. 
(http://genomebiology.com/2001/3/1/reports/0006/). 

(7) John C Rockett (2001). A report on the manuscript: Molecular Classification of Human 
Carcinomas by Use of Gene Expression Signatures. A Su et al., Cancer Res. 2001 61:7388-7393. 
GenomeBiology, 3(1):reports0005. (http://genomebiology.com/2001/3/1/reports/0005/). 

(6) John C Rockett (2001). A report on the manuscript: Genetic evidence for two species of 
elephant in Africa. A Roca et al., Science. 2001 Aug 24;293(5534):1473-7. GenomeBiology, 
2(12):reports0045. (http://www.genomebiology.com/2001/2/12/reports/0045/). 

(5) John C. Rockett (2001). A report on the manuscript: Extensive genetic polymorphism in the 
human CYP2B6 gene with impact on expression and function in human liver. T Lang et al., 
Pharmacogenetics, 2001, 11(5):399-415. GenomeBiology, 2(12):reports0044. 
(http://www.genomebiology.com/2001/2/12/reports/0044/). 

(4) John C. Rockett (2001). A report on the manuscript: Novel Human Testis-Specific cDNA: 
molecular Cloning, Expression and Immunological Effects of the Recombinant Protein. R 
Santhanam and R K Naz, Molecular Reproduction and Development 60:1-12 (2001). 
GenomeBiology, 2(11):reports0040. (http://genomebiology.com/2001/2/11/reports/0040/). 

(3) John C. Rockett (2001). A report on the website: BIND - The Biomolecular Interaction 
Network Database (http://www.bind.ca/). GenomeBiology, 2(9):reports2011. 
(http://www.genomebiology.com/2001/2/9/reports/2011/). 

(2) John C Rockett (2001). A report on the manuscript: Exploring the DNA-binding specificities 
of zinc fingers with DNA microarrays. ML Bulyk et al., Proc Natl Acad Sci USA 2001, 98:7158-7163. 
GenomeBiology, 2(10):reports0032. (http://genomebiology.com/2001/2/10/reports/0032/). 

(1) J Rockett (1996). A Book Review on: "Cell Adhesion and Cancer" (Eds., Hogg N. and Hart I.). 
Clinical Molecular Pathology 49(1):M64. * An invited submission* 



(7) Published Abstracts of Poster and Oral Presentations 

(17) Amber K. Goetz, Wenjun Bao, Judith E. Schmid, Carmen Wood, Hongzu Ren, Deborah S. 
Best, Rachel N. Murrell, John C. Rockett, Michael G. Narotsky, Douglas C. Wolf, Douglas B. 
Tully, David J. Dix* Gene Expression Profiling in Testis and Liver of Mice to Identify Modes of 
Action of Conazole Toxicities. Society of Toxicology 43 rd Annual Meeting, March 21 sl -25 lh , 2004, 
Baltimore, MD, USA. Toxicological Sciences. (Submitted) 

(16) Jane Gallagher, Theresa Lehman, Ramakrishna Modali, Scott Rhoney, Marien Clas, Jeff 
Inmon, John C Rockett, David Dix, Cindy Mamay, Suzanne Fenton, Suzanne McMaster, Stan 
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Barone Jr, Pauline Mendola and Reeder Sams. Validation of Non-Invasive Biological Samples: 
Pilot Projects Relevant to the National Children Study. Society of Toxicology 43 rd Annual Meeting, 
March 21 st -25 th , 2004, Baltimore, MD, USA. Toxicological Sciences. (Submitted) 

(15) B.S. Pukazhenthi, J. C Rockett, M. Ouyang, D.J. Dix, J.G. Howard, P. Georgopoulos, W.J. J. 
Welsh and D. E. Wildt. Gene Expression In The Testis Of Normospermic Versus Teratospermic 
Domestic Cats Using Human cDNA Microarray Analyses, Society for the Study of Reproduction 
36th Annual Meeting, July 19th-22nd, 2003, Cincinnati, OH, USA. Biology of Reproduction 68 (Supp
1):191. 

(14) David J. Dix and John C. Rockett (2003). Genomic and Proteomic Analysis of Surrogate 
Tissues for Assessing Toxic Exposures and Disease States. Innovation in Applied Toxicology 
symposium entitled "Genomic and Proteomic Analysis of Surrogate Tissues for Assessing Toxic 
Exposures and Disease States". Society of Toxicology 42nd Annual Meeting, March 9th-13th, 2003,
Salt Lake City, UT, USA. Toxicological Sciences 72(S-1):276.

(13) John C Rockett, Chad R. Blystone, Amber K. Goetz, Rachel N. Murrell, Judith E. Schmid 
and David J. Dix. (2003). Gene Expression Profiling Of Accessible Surrogate Tissues To Monitor 
Molecular Changes In Inaccessible Target Tissues Following Toxicant Exposure. Innovations in 
Applied Toxicology Symposium entitled "Genomic and Proteomic Analysis of Surrogate Tissues 
for Assessing Toxic Exposures and Disease States". Society of Toxicology 42nd Annual Meeting,
March 9th-13th, 2003, Salt Lake City, UT, USA. Toxicological Sciences 72(S-1):276.

(12) Douglas B. Tully, J. Christopher Luft, John C Rockett, Judy E. Schmid and David J. Dix 
(2002). Effects on gene expression in testes from adult male mice exposed to the water disinfectant 
byproduct bromochloroacetic acid. Society for the Study of Reproduction 35th Annual Meeting, July
28-31, 2002, Baltimore, Maryland, USA. Biology of Reproduction 66 (Supp 1):223. 

(11) David J. Dix, Kary E. Thompson, John C. Rockett, Judith E. Schmid, Robert J. Goodrich,
David Miller, G. Charles Ostermeier and Stephen A. Krawetz (2002). Testis and spermatozoal RNA
profiles of normal fertile men. Society for the Study of Reproduction 35th Annual Meeting, July 28-
31, 2002, Baltimore, Maryland, USA. Biology of Reproduction 66 (Supp 1):194.

(10) Asa J. Oudes, John C. Rockett, David J. Dix and Kwan Hee Kim (2002). Identification of 
retinoic acid induced genes in mouse testis by cDNA microarray analysis. 27th Annual Meeting of
the American Society of Andrology, 4/24-27/02. J. Andrology Supplement March/April.

(9) John C Rockett, Robert J. Kavlock, Christy Lambright, Louise G. Parks, Judith E. Schmid, 
Vickie S. Wilson and David J. Dix (2002). Use of DNA arrays to monitor gene expression in blood 
and uterus from Long-Evans rats following 17-β-estradiol exposure - a new approach to
biomonitoring endocrine disrupting chemicals using surrogate tissues. Toxicological Sciences 66(1): 
Abstract No. 1388. 

(8) David J. Dix and John C Rockett (2002). Genomic analysis of the testicular toxicity of 
haloacetic acids. Platform presentation at the symposium, "Defining the cellular and molecular
mechanisms of toxicant action in the testis". Toxicological Sciences 66 (1): Abstract No. 848.

(7) JC Rockett, JC Luft, JB Garges and DJ Dix (2001). The reproductive effects of the water 
disinfectant byproduct bromochloroacetate on juvenile and adult male mice. Toxicological 
Sciences, 60(1):250. 

(6) Tarka DK, Klinefelter GR, Rockett JC, Suarez JD, Roberts NL and Rogers JM (2001). Effect 
of gestational exposure to ethane dimethane sulfonate (EDS), bromochloroacetic acid (BCA) and
molinate on reproductive function in CD-1 male mice. Toxicological Sciences, 60 (1):250.

(5) Garges JB, Rockett JC and Dix DJ (2001). Developmental and reproductive phenotype of mice 
lacking stress-inducible 70 kDa heat shock proteins (Hsp70s). Toxicological Sciences, 60 (1):383.

(4) D Dix, J Rockett, J Luft, J Garges, M Ricci, P Patrizio and N Hecht (2000). Using DNA
microarrays to characterise gene expression in testes of fertile and infertile humans and mice.
Biology of Reproduction, 62 (s1):227.

(3) J Luft, J B Garges, J Rockett and D Dix (2000). Male reproductive toxicity of 
bromochloroacetic acid in mice. Biology of Reproduction, 62 (s1):246.

(2) Rockett, JC, Garges, JB and Dix, DJ (2000). A single heat-shock of juvenile male mice causes 
a long-term decrease in fertility and reduces embryo quality. Toxicological Sciences 54 (1):365. 

(1) JC Rockett, SJ Darnton, J Crocker, HR Matthews and AG Morris (1994). Major
Histocompatibility (MHC) class I and II and intercellular adhesion molecule (ICAM)-1 expression
in oesophageal carcinoma (OC). Immunology 83 (s1):64.



(8) Invited Oral Presentations 

(10) John C. Rockett and Gary M Hellmann. To confirm or not to confirm (microarray data) - 
that is the question. Seminar for EPA/NHEERL Genomics and Proteomics Committee's ArrayQA 
forum, August 25th, 2003, RTP, NC, USA.

(9) John C. Rockett. "Biomonitoring Toxicant Exposure and Effect Using Toxicogenomics and 
Surrogate Tissue Analysis". Seminar for Division of Epidemiology, Statistics and Prevention 
Research, National Institute of Child Health and Development, May 29th, 2003, Rockville, MD,
USA. 

(8) John C. Rockett. "Genomics and Proteomics: New Toxicity Testing". Platform presentation at
US EPA Regional Risk Assessors Annual Conference, April 28th - May 2nd, 2003, Stone Mountain,
GA, USA. 

(7) John C Rockett, Chad R. Blystone, Amber K. Goetz, Rachel N. Murrell, Judith E. Schmid and 
David J. Dix. "Gene Expression Profiling Of Accessible Surrogate Tissues To Monitor Molecular 
Changes in Inaccessible Target Tissues Following Toxicant Exposure" Platform presentation at 
SoT 42nd Annual Meeting symposium entitled "Genomic and Proteomic Analysis of Surrogate
Tissues for Measuring Toxic Exposures and Drug Action", March 9th-13th, 2003, Salt Lake City,
UT, USA.

(6) John C. Rockett. "A Toxicogenomic Approach to Surrogate Tissue Analysis". Seminar for
Department of Environmental and Molecular Toxicology, North Carolina State University,
September 3rd, 2002, Raleigh, NC, USA.

(5) John C. Rockett. "Differential gene expression in toxicology: practicalities, problems and
potential". Platform presentation at 9th Annual Mount Desert Island Biological Laboratory
Environmental Health Sciences Symposium: Exploiting Genome Data to Understand the Function,
Regulation and Evolutionary Origins of Toxicologically Relevant Genes, July 10th-11th, 2002,
Salisbury Cove, Maine, USA. 

(4) John C. Rockett, Leroy Folmar, Michael J. Hemmer and David J. Dix. "Arrays for 
biomonitoring environmental and reproductive toxicology". Platform Presentation at Macroresults 
Through Microarrays 3 - Advancing Drug Development, April 29th-May 1st, 2002, Boston, MA,
USA. 

(3) John C. Rockett, Sigmund Degitz, Suzanne E. Fenton, Leroy Folmar, Michael J. Hemmer, Joe 
E Tietge, and David J. Dix. "Use of DNA Arrays in Environmental Toxicology". Platform 
presentation at the 4th Annual Lab-on-a-Chip and Microarrays for Post-Genomic Applications
meeting, January 14th-16th, 2002, Zurich, Switzerland.

(2) John C. Rockett. "DNA Arrays". Seminar at EPA Molecular Biology Course, April 8th, 1999,
USEPA, RTP, NC, USA. 

(1) John C. Rockett. "Contract Services for Array Applications". Seminar at the Triangle Array
Users Group, May 1st, 1999, CIIT, RTP, NC, USA.



(9) Other Poster and Oral Presentations 

(23) John C Rockett, Wenjun Bao, Chad R. Blystone, Amber K. Goetz, Rachel N. Murrell, 
Hongzu Ren, Judith E. Schmid, Jessica Stapelfeldt, Lillian F. Strader, Kary E. Thompson and David 
J. Dix. Genomic Analysis of Surrogate Tissues for Assessing Environmental Exposures and Future 
Disease States. ILSI-HESI meeting: Toxicogenomics in Risk Assessment - Assessing the Utility, 
Challenges, and Next Steps. June 5th-6th, 2003, Fairfax, VA, USA.

(22) John C. Rockett, Wenjun Bao, Chad R. Blystone, Amber K. Goetz, Rachel N. Murrell,
Hongzu Ren, Judith E. Schmid, Jessica Stapelfeldt, Lillian F. Strader, Kary E. Thompson and David
J. Dix. Genomic Analysis of Surrogate Tissues for Assessing Environmental Exposures and Future
Disease States. EPA Science Forum, May 5th-7th, 2003, Washington, D.C., USA.

(21) Germaine Buck, Courtney Johnson, Joseph Stanford, Anne Sweeney, Laura Schieve, John
Rockett, Sherry Selevan and Steve Schrader. Prospective Pregnancy Study Designs for Assessing
Reproductive and Developmental Toxicants. American Epidemiology Society Meeting, March 27th-
28th, 2003, Atlanta, GA, USA.

(20) John C. Rockett, Chad R. Blystone, Amber K. Goetz, Rachel N. Murrell, Hongzu Ren, Judith 
E Schmid, Jessica Stapelfeldt, Lillian F. Strader, Kary E. Thompson, Doug B. Tully, Paul Zigas 
and David J. Dix. Genomic Analysis of Surrogate Tissues for Assessing Environmental Exposures 
and Future Disease States. National Children's Study Assembly Meeting, December 16th-18th, 2002,
Baltimore, MD, USA. 

(19) John Rockett. The Use of Gene Expression Profiling to Detect Early Biomarkers of Adverse
Effects Prior to Clinical Manifestation. National Children's Study: Meeting of EPA Project Leaders
- Methods Development Projects. November 20th, 2002, USEPA, RTP, NC, USA. (Oral
Presentation) 

(18) GC Ostermeier, RJ Goodrich, K Thompson, J Rockett, MP Diamond, K Collins, NICHD
Reproductive Medicine Network, DJ Dix, D Miller and SA Krawetz. Defining the spermatozoal
RNA population in normal fertile men. American Society of Reproductive Medicine October 12-17, 
2002, Seattle, WA, USA. 

(17) G Charles Ostermeier, Robert J. Goodrich, Kary Thompson, John Rockett, Michael P. 
Diamond, Karen Collins, NICHD Reproductive Medicine Network, David J. Dix, David Miller and
Stephen A Krawetz. RNAs isolated from ejaculate spermatozoa provide a noninvasive means to
investigate testicular gene expression. Gordon Conference on Mammalian Gametogenesis &
Embryogenesis, June 30th-July 5th, Connecticut College, New London, CT, USA.

(16) David Dix, John Rockett, Judith Schmid, Lillian Strader, Douglas Tully. Genomic analysis of 
testicular toxicity. USEPA/NHEERL/RTD Peer Review, October 22nd, 2001, RTP, NC, USA.

(15) David Dix, John Rockett, Judith Schmid, Douglas Tully. Monitoring human reproductive 
health and development through gene expression profiling. USEPA/NHEERL/RTD Peer Review, 
October 22nd, 2001, RTP, NC, USA.

(14) Patrizio P, N Hecht, J Rockett, J Schmid and D Dix (2001). DNA microarrays to study gene 
expression profiles in testis of fertile and infertile men. 57th Annual Meeting of the American 
Society for Reproductive Medicine, October 20th-25th, 2001, Orlando, FL, USA.

(13) Jimmy L. Spearow, Dale Morris, Uland Wong, Rashid Altafi, Saeed Eteiwi, Mark Stanford, 
Trevor Steams, Lorena Orozio, Angela Chen, John Rockett, Douglas Tully, David Dix and 
Marylynn Barkley. Genetic Variation In Susceptibility To The Disruption Of Testicular 
Development And Gene Expression By Pubertal Exposure To Estrogenic Agents. Third Annual 
University of California at Davis Conference for Environmental Health Scientists, Disruption of 
Developing Systems and Advances in Therapeutic Approaches, August 27th, 2001, UC Davis, CA,
USA.

(12) Tarka DK, Klinefelter GR, Rockett JC, Suarez JD, Roberts NL and Rogers JM (2001). Effect
of gestational exposure to ethane dimethane sulfonate (EDS), bromochloroacetic acid (BCA) and
molinate on reproductive function in CD-1 male mice. North Carolina Society of Toxicology Winter
Meeting, March 3rd, 2001. NIEHS, RTP, NC, USA.

(11) David Dix, John Rockett, Leroy Folmar, Michael Hemmer, Sigmund Degitz, and Joseph
Tietge (2001). Biomonitoring the Toxicogenomic Response to Endocrine Disrupting Chemicals in
Humans, Laboratory Species and Wildlife. U.S. - Japan International Workshop for Endocrine
Disrupting Chemicals, February 28th-March 3rd, 2001, Tsukuba, Japan.

(10) John C. Rockett, Faye L. Mapp, J. Brian Garges, J. Christopher Luft, Chisato Mori and David
J. Dix (2001). The effects of hyperthermia on spermatogenesis, apoptosis, gene
expression and fertility in adult male mice. Triangle Consortium for Reproductive Biology Annual
Meeting, January 27th, 2001, RTP, NC, USA.

(9) Gangolli E, Dix DJ, Garges JB, Rockett JC and Idzerda RL (2000). Testosterone Regulation
of Sertoli Cell genes. 11th International Congress of Endocrinology, October 29th-November 2nd,
2000, Sydney, Australia.

(8) J Rockett, J Luft, J Garges, M Ricci, P Patrizio, N Hecht and D Dix (2000). Using DNA 
microarrays to characterise gene expression in testes of fertile and infertile humans and mice. 
Functional Genomics & Microarray Data Mining, August 3rd-4th, 2000, Durham, NC, USA.

(7) Rockett JC, S Ricci, P Patrizio, NB Hecht, JB Garges and DJ Dix (2000). Gene Expression in 
the Mammalian Testis. 5th NHEERL Symposium, June 6th-8th, 2000, RTP, NC, USA.

(6) J Luft, J B Garges, J Rockett and D Dix (2000). Male reproductive toxicity of 
bromochloroacetic acid in mice. 2000 NIEHS/NTA Biomedical Science and Career Fair, April 28th,
2000, RTP, NC, USA.

(5) Rockett JC, S Ricci, P Patrizio, NB Hecht, JB Garges and DJ Dix (2000). Gene Expression in 
the Mammalian Testis. Molecular Toxicology, Toxicogenomics and Associated Bioinformatics 
Applied to Drug Discovery meeting, January 11th-15th, 2000, Santa Fe, NM, USA.

(4) JC Rockett and DJ Dix (1999). Development of DNA arrays for the analysis of testis-expressed 
genes in humans and mice. The 8th Annual National Health and Environmental Effects Research 
Laboratory Open House. November 2nd-3rd, 1999, RTP, NC, USA.

(3) JC Rockett, DJ Esdaile and GG Gibson (1997). Molecular profiling of non-genotoxic 
carcinogenesis using differential display reverse transcription polymerase chain reaction (ddRT-
PCR). The British Toxicology Society Annual Meeting, April 19th-22nd, 1998, University of Surrey,
Guildford, Surrey, England. 

(2) JC Rockett, DJ Esdaile and GG Gibson (1997). Molecular profiling of non-genotoxic 
carcinogenesis using differential display reverse transcription polymerase chain reaction (ddRT-
PCR). Poster presentation at Symposium on Drug Metabolism: Towards the next Millennium.
August 26th-28th, 1997, London King's College, London, England.

(1) J Rockett, S Darnton, J Crocker, H Matthews and A Morris: Major Histocompatibility Complex
(MHC) class I and II and Intercellular Adhesion Molecule (ICAM)-1 expression in oesophageal
carcinoma. Oral presentation at The 6th World Congress of the International Society for Diseases of
the Esophagus, August 23rd-26th, 1995, Milan, Italy.

Exhibit B of Rockett Declaration
with Response dated 03/18/04
In USSN: 10/031,904





Quantitative Monitoring of Gene Expression 
Patterns with a Complementary DNA Microarray 

Mark Schena,* Dari Shalon,*† Ronald W. Davis,
Patrick O. Brown* 

A high-capacity system was developed to monitor the expression of many genes in
parallel. Microarrays prepared by high-speed robotic printing of complementary DNAs on
glass were used for quantitative expression measurements of the corresponding genes.
Because of the small format and high density of the arrays, hybridization volumes of 2
microliters could be used that enabled detection of rare transcripts in probe mixtures
derived from 2 micrograms of total cellular messenger RNA. Differential expression
measurements of 45 Arabidopsis genes were made by means of simultaneous, two-color
fluorescence hybridization.



The temporal, developmental, topographical, histological, and physiological
patterns in which a gene is expressed provide clues to its biological role.
The large and expanding database of complementary DNA (cDNA) sequences from
many organisms (1) presents the opportunity of defining these patterns at
the level of the whole genome.

For these studies, we used the small flowering plant Arabidopsis thaliana
as a model organism. Arabidopsis possesses many advantages for gene
expression analysis, including the fact that it has the smallest genome of
any higher eukaryote examined to date (2). Forty-five cloned Arabidopsis
cDNAs (Table 1), including 14 complete sequences and 31 expressed sequence
tags (ESTs), were used as gene-specific targets. We obtained the ESTs by
selecting cDNA clones at random from an Arabidopsis cDNA library. Sequence
analysis revealed that 28 of the 31 ESTs matched sequences in the database
(Table 1). Three additional cDNAs from other organisms served as controls
in the experiments.

M. Schena and R. W. Davis, Department of Biochemistry,
Beckman Center, Stanford University Medical Center,

The 48 cDNAs, averaging ~1.0 kb, were amplified with the polymerase chain
reaction (PCR) and deposited into individual wells of a 96-well microtiter
plate. Each sample was duplicated in two adjacent wells to allow the
reproducibility of the arraying and hybridization process to be tested.
Samples from the microtiter plate were printed onto glass microscope slides
in an area measuring 3.5 mm by 5.5 mm with the use of a high-speed arraying
machine (3). The arrays were processed by chemical and heat treatment to
attach the DNA sequences to the glass surface and denature them (3). Three
arrays, printed in a single lot, were used for the experiments here. A
single microtiter plate of PCR products provides sufficient material to
print at least 500 arrays.
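
The plate layout described above (each sample in two adjacent wells of a
96-well plate) can be sketched as follows. This is only an illustration,
assuming rows lettered a to h, columns numbered 1 to 12, and samples filled
row by row; the function name and the placeholder sample names are
hypothetical and do not come from the report.

```python
from string import ascii_lowercase


def duplicate_layout(samples, n_rows=8, n_cols=12):
    """Assign each sample to two adjacent wells, filling the plate row by row.

    Assumption: rows are lettered a-h and columns numbered 1-12, so the
    first sample occupies wells a1 and a2, written here as "a1,2".
    """
    pairs_per_row = n_cols // 2
    if len(samples) > n_rows * pairs_per_row:
        raise ValueError("too many samples for one plate")
    layout = {}
    for i, name in enumerate(samples):
        row = ascii_lowercase[i // pairs_per_row]
        first_col = 2 * (i % pairs_per_row) + 1
        layout[name] = f"{row}{first_col},{first_col + 1}"
    return layout


# Placeholder names standing in for the 48 cDNAs.
samples = [f"cDNA{i + 1}" for i in range(48)]
layout = duplicate_layout(samples)
print(layout["cDNA1"], layout["cDNA48"])
```

With 48 samples in duplicate, all 96 wells are used, which is consistent
with the position labels listed in Table 1.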

Fluorescent probes were prepared from total Arabidopsis mRNA (4) by a
single round of reverse transcription (5). The Arabidopsis mRNA was
supplemented with human acetylcholine receptor (AChR) mRNA at a dilution of
1:10,000 (w/w) before cDNA synthesis, to provide an internal standard for
calibration (5). The resulting fluorescently labeled cDNA mixture was
hybridized to an array at high stringency (6) and scanned with a laser (3).
A high-sensitivity scan gave signals that saturated the detector at nearly
all of the Arabidopsis target sites (Fig. 1A). Calibration relative to the
AChR mRNA standard (Fig. 1A) established a sensitivity limit of ~1:50,000.
No detectable hybridization was observed to either the rat glucocorticoid
receptor (Fig. 1A) or the yeast TRP4 (Fig. 1A) targets even at the highest
scanning sensitivity. A moderate-sensitivity scan of the same array allowed
linear detection of the more abundant transcripts (Fig. 1B). Quantitation
of both scans revealed a range of expression levels spanning three orders
of magnitude for the 45 genes tested (Table 2). RNA blots (7) for several
genes (Fig. 2) corroborated the expression levels measured with the
microarray to within a factor of 5 (Table 2).
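
The calibration against a spiked-in standard can be illustrated with a
short sketch. It assumes a linear detector response, so that signal is
proportional to transcript abundance; the function name and the signal
values are hypothetical and are not taken from the report.

```python
def abundance_ratio(signal, standard_signal, standard_dilution):
    """Return N such that the transcript abundance is about 1:N (w/w).

    Assumptions (not stated in this form in the report): the detector
    responds linearly, and the standard was added at 1:standard_dilution.
    """
    if signal <= 0 or standard_signal <= 0:
        raise ValueError("signals must be positive")
    # abundance = (signal / standard_signal) * (1 / standard_dilution)
    return standard_dilution * standard_signal / signal


# Hypothetical signals: the target is 10 times brighter than a standard
# that was spiked in at 1:10,000, so its abundance is about 1:1,000.
n = abundance_ratio(signal=2000, standard_signal=200, standard_dilution=10_000)
print(f"1:{n:.0f}")
```

The same relation run in reverse gives a sensitivity limit: the faintest
signal distinguishable from background, expressed relative to the standard.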

Differential gene expression was investigated with a simultaneous,
two-color hybridization scheme, which served to minimize experimental
variation inherent in the comparison of independent hybridizations.
Fluorescent probes were prepared from two mRNA sources with the use of
reverse transcriptase in the presence of fluorescein- and lissamine-labeled
nucleotide analogs, respectively (5). The two probes were then mixed
together in equal proportions, hybridized to a single array, and scanned
separately for fluorescein and lissamine emission after independent
excitation of the two fluorophores (3).

To test whether overexpression of a single gene could be detected in a pool
of total Arabidopsis mRNA, we used a microarray to analyze a transgenic
line overexpressing the single transcription factor HAT4 (8). Fluorescent
probes representing mRNA from wild-type and HAT4-transgenic plants were
labeled with fluorescein and lissamine, respectively; the two probes were
then mixed and hybridized to a single array. An intense hybridization
signal was observed at the position of the HAT4 cDNA in the
lissamine-specific scan (Fig. 1D), but not in the fluorescein-specific scan
of the same array (Fig. 1C). Calibration with AChR mRNA added to the
fluorescein and lissamine cDNA synthesis reactions at dilutions of 1:10,000
(Fig. 1C) and 1:100 (Fig. 1D), respectively, revealed a 50-fold elevation
of HAT4 mRNA in the transgenic line relative to its abundance in wild-type
plants (Table 2). This magnitude of HAT4 overexpression matched that
inferred from the Northern (RNA) analysis within a factor of 2 (Fig. 2 and
Table 2). Expression of all the other genes monitored on the array differed
by less than a factor of 5 between HAT4-transgenic and wild-type plants
(Fig. 1, C and D, and Table 2). Hybridization of fluorescein-labeled
glucocorticoid receptor cDNA (Fig. 1C) and lissamine-labeled TRP4 cDNA
(Fig. 1D) verified the presence of the negative control targets and the
lack of optical cross talk between the two fluorophores.

Fig. 1. Gene expression monitored with the use of cDNA microarrays.
Fluorescent scans represented in pseudocolor correspond to hybridization
intensities. Color bars were calibrated from the signal obtained with the
use of known concentrations of human AChR mRNA [illegible]. [illegible]
letters on the axes mark the position of [illegible]. (A) [illegible] with
fluorescein-labeled cDNA derived from wild-type plants. (B) Same array as
in (A) but scanned at moderate sensitivity. (C and D) A single array was
probed with a 1:1 mixture of fluorescein-labeled cDNA from wild-type plants
and lissamine-labeled cDNA from HAT4-transgenic plants. The single array
was then scanned successively to detect the fluorescein fluorescence
corresponding to mRNA from wild-type plants (C) and the lissamine
fluorescence corresponding to mRNA from HAT4-transgenic plants (D). (E and
F) A single array was probed with a 1:1 mixture of fluorescein-labeled cDNA
from root tissue and lissamine-labeled cDNA from leaf tissue. The single
array was then scanned successively to detect the fluorescein fluorescence
corresponding to mRNAs expressed in roots (E) and the lissamine
fluorescence corresponding to mRNAs expressed in leaves (F).

Fig. 2. Gene expression monitored with RNA (Northern) blot analysis.
Designated amounts of mRNA from wild-type and HAT4-transgenic plants were
spotted onto nylon membranes and probed with the cDNAs indicated. Purified
human AChR mRNA was used for calibration.
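
The two-color comparison with a different standard dilution in each channel
can be sketched as below. This is a minimal illustration, assuming linear
detector response in both channels; the function names and all signal
values are hypothetical and were chosen only so that the arithmetic is easy
to follow.

```python
import math


def abundance(signal, standard_signal, standard_dilution):
    """Fractional abundance (w/w) inferred from a spiked-in standard.

    Assumption: signal is proportional to abundance within one channel.
    """
    return (signal / standard_signal) / standard_dilution


def fold_change(reference, test):
    """Ratio of test abundance to reference abundance.

    Each argument is a (signal, standard_signal, standard_dilution) tuple
    for one fluorescence channel.
    """
    return abundance(*test) / abundance(*reference)


# Hypothetical readings for one target:
# fluorescein channel (wild type), standard added at 1:10,000
wild_type = (120.0, 100.0, 10_000)
# lissamine channel (transgenic), standard added at 1:100
transgenic = (60.0, 100.0, 100)

print(round(fold_change(wild_type, transgenic), 1))
```

Using a more concentrated standard in the channel where a strong signal is
expected keeps both the target and the standard within the same usable
range of the detector, which is one plausible reading of why the two
dilutions differ.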

To explore a more complex alteration in expression patterns, we performed a
second two-color hybridization experiment with fluorescein- and
lissamine-labeled probes prepared from root and leaf mRNA, respectively.
The scanning sensitivities for the two fluorophores were normalized by
matching the signals resulting from AChR mRNA, which was added to both cDNA
synthesis reactions at a dilution of 1:1000 (Fig. 1, E and F). A comparison
of the scans revealed widespread differences in gene expression between
root and leaf tissue (Fig. 1, E and F). The mRNA from the light-regulated
CAB1 gene was ~500-fold more abundant in leaf (Fig. 1F) than in root tissue
(Fig. 1E). The expression of 26 other genes differed between root and leaf
tissue by more than a factor of 5 (Fig. 1, E and F).
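
The normalization and the factor-of-5 criterion can be sketched as follows.
This is an illustration only: it assumes that matching the standard's
signal between channels amounts to a single scale factor, and the function
name, gene names, and signal values are hypothetical.

```python
def differential_genes(root, leaf, standard="AChR", threshold=5.0):
    """Flag genes whose leaf/root ratio exceeds a fold threshold.

    root and leaf map gene names to signals in the two channels. The leaf
    channel is rescaled so that the spiked-in standard matches the root
    channel (an assumed, simple form of the normalization described above).
    """
    scale = root[standard] / leaf[standard]
    flagged = {}
    for gene, root_signal in root.items():
        if gene == standard or root_signal <= 0:
            continue
        ratio = leaf[gene] * scale / root_signal
        if ratio > threshold or ratio < 1.0 / threshold:
            flagged[gene] = ratio
    return flagged


# Hypothetical signals for three targets plus the standard.
root = {"AChR": 100.0, "geneA": 10.0, "geneB": 200.0, "geneC": 50.0}
leaf = {"AChR": 200.0, "geneA": 2000.0, "geneB": 500.0, "geneC": 10.0}

print(differential_genes(root, leaf))
```

In this toy input, geneA is flagged as leaf-enriched, geneC as
root-enriched, and geneB falls inside the factor-of-5 band.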

The HAT4-transgenic line we examined has elongated hypocotyls, early
flowering, poor germination, and altered pigmentation (8). Although changes
in expression were observed for HAT4, large changes in expression were not
observed for any of the other 44 genes we examined. This was somewhat
surprising, particularly because comparative analysis of leaf and root
tissue identified 27 differentially expressed genes. Analysis of an
expanded set of genes may be required to identify genes whose expression
changes upon HAT4 overexpression; alternatively, a comparison of mRNA
populations from specific tissues of wild-type and HAT4-transgenic plants
may allow identification of downstream genes.

Table 1. Sequences contained on the cDNA microarray. Shown is the position
[illegible] in this study matched a sequence in the database. [illegible]
nicotinamide adenine dinucleotide; ATPase, adenosine triphosphatase
[illegible].

Position  cDNA    Function                        Accession number
a1,2      AChR    Human AChR                      *
a3,4      EST3    Actin                           H36236
a5,6      EST6    NADH dehydrogenase              Z27010
a7,8      AAC1    Actin 1                         M20016
a9,10     EST12   Unknown                         U36594†
a11,12    EST13   Actin                           T45783
b1,2      CAB1    Chlorophyll a/b binding         M85150
b3,4      EST17   Phosphoglycerate kinase         T44490
b5,6      GA4     Gibberellic acid biosynthesis   L37126
b7,8      EST19   Unknown                         U36595†
b9,10     GBF-1   G-box binding factor 1          X63894
b11,12    EST23   Elongation factor               X52256
c1,2      EST29   Aldolase                        T04477
c3,4      GBF-2   G-box binding factor 2          X63895
c5,6      EST34   Chloroplast protease            R87034
c7,8      EST35   Unknown                         T14152
c9,10     EST41   Catalase                        T22720
c11,12    rGR     Rat glucocorticoid receptor     M14053
d1,2      EST42   Unknown                         U36596†
d3,4      EST45   ATPase                          J04185
d5,6      HAT1    Homeobox-leucine zipper 1       U09332
d7,8      EST46   Light harvesting complex        T04063
d9,10     EST49   Unknown                         T76267
d11,12    HAT2    Homeobox-leucine zipper 2       U09335
e1,2      HAT4    Homeobox-leucine zipper 4       M90394
e3,4      EST50   Phosphoribulokinase             T04344
e5,6      HAT5    Homeobox-leucine zipper 5       M90416
e7,8      EST51   Unknown                         Z33675
e9,10     HAT22   Homeobox-leucine zipper 22      U09336
e11,12    EST52   Oxygen evolving                 T21749
f1,2      EST59   Unknown                         Z34607
f3,4      KNAT1   Knotted-like homeobox 1         U14174
f5,6      EST60   RuBisCO small subunit           X14564
f7,8      EST69   Translation elongation factor   T42799
f9,10     PPH1    Protein phosphatase 1           U34803
f11,12    EST70   Unknown                         T44621
g1,2      EST75   Chloroplast protease            [illegible]
g3,4      EST78   Unknown                         R65481
g5,6      ROC1    Cyclophilin                     L14644
g7,8      EST82   GTP binding                     X59152
g9,10     EST83   Unknown                         Z33795
g11,12    EST84   Unknown                         T45278
h1,2      EST91   Unknown                         T13832
h3,4      EST96   Unknown                         R64816
h5,6      SAR1    Synaptobrevin                   M90418
h7,8      EST100  Light harvesting complex        Z18205
h9,10     EST103  Light harvesting complex        X03909
h11,12    TRP4    Yeast tryptophan biosynthesis   X04273

*Proprietary sequence of Stratagene (La Jolla, California).

At the current density of robotic printing,
it is feasible to scale up the fabrication process
to produce arrays containing 20,000
cDNA targets. At this density, a single array
would be sufficient to provide gene-specific
targets encompassing nearly the entire repertoire
of expressed genes in the Arabidopsis
genome (2). The availability of 20,274 ESTs
from Arabidopsis (1, 9) would provide a rich
source of templates for such studies.

The estimated 100,000 genes in the hu- 
man genome (10) exceeds the number of 
Arabidopsis genes by a factor of 5 (2). This 
modest increase in complexity suggests that 
similar cDNA microarrays, prepared from 
the rapidly growing repertoire of human 
ESTs (1), could be used to determine the 
expression patterns of tens of thousands of 
human genes in diverse cell types. Coupling 
an amplification strategy to the reverse 
transcription reaction (11) could make it
feasible to monitor expression even in 
minute tissue samples. A wide variety of 
acute and chronic physiological and patho- 
logical conditions might lead to character- 
istic changes in the patterns of gene expres- 
sion in peripheral blood cells or other easily 
sampled tissues. In concert with cDNA mi- 
croarrays for monitoring complex expres- 
sion patterns, these tissues might therefore 
serve as sensitive in vivo sensors for clinical 
diagnosis. Microarrays of cDNAs could thus 
provide a useful link between human gene 
sequences and clinical medicine. 



† No match in the database: novel EST.

Table 2. Gene expression monitoring by microarray
and RNA blot analyses; tg, HAT4-transgenic.
See Table 1 for additional gene information.
Expression levels (w/w) were calibrated with the use
of known amounts of human AChR mRNA. Values
for the microarray were determined from microarray
scans (Fig. 1); values for the RNA blot were
determined from RNA blots (Fig. 2).

Gene         Expression level (w/w)
             Microarray    RNA blot
CAB1         1:48          1:83
CAB1 (tg)    1:120         1:150
HAT4         1:8300        1:6300
HAT4 (tg)    1:150         1510
ROC1         1:1200        1:1800
ROC1 (tg)    1560          1:1300
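
The expression levels in Table 2 are mass ratios, so the effect of the transgene can be read as a fold change between two ratios. The following is a minimal sketch of that arithmetic, assuming the HAT4 microarray entries are 1:8300 for wild type and 1:150 for the transgenic line; the function names are illustrative and not part of the original report.

```python
from fractions import Fraction


def parse_ratio(text: str) -> Fraction:
    """Convert a w/w ratio written as 'a:b' (e.g. '1:8300') to a fraction a/b."""
    numerator, denominator = text.split(":")
    return Fraction(int(numerator), int(denominator))


def fold_change(reference: str, sample: str) -> float:
    """Fold change of the sample relative to the reference, both given as w/w ratios."""
    return float(parse_ratio(sample) / parse_ratio(reference))


# HAT4 microarray values as read from Table 2 (assumed row assignment).
hat4_fold = fold_change("1:8300", "1:150")
print(f"HAT4 (tg) relative to HAT4: {hat4_fold:.1f}-fold")
```

Under these assumptions the transgenic line shows roughly a 55-fold increase in HAT4 message on the microarray.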
Gene Therapy in Peripheral Blood
Lymphocytes and Bone Marrow for 
ADA Immunodeficient Patients 

Claudio Bordignon, Luigi D. Notarangelo, Nadia Nobili,
Giuliana Ferrari, Giulia Casorati, Paola Panina, Evelina Mazzolari, 
Daniela Maggioni, Claudia Rossi, Paolo Servida, 
Alberto G. Ugazio, Fulvio Mavilio 

Adenosine deaminase (ADA) deficiency results in severe combined immunodeficiency,
the first genetic disorder treated by gene therapy. Two different retroviral vectors were
used to transfer ex vivo the human ADA minigene into bone marrow cells and peripheral 
blood lymphocytes from two patients undergoing exogenous enzyme replacement ther- 
apy. After 2 years of treatment, long-term survival of T and B lymphocytes, marrow cells, 
and granulocytes expressing the transferred ADA gene was demonstrated and resulted 
in normalization of the immune repertoire and restoration of cellular and humoral immunity. 
After discontinuation of treatment, T lymphocytes, derived from transduced peripheral 
blood lymphocytes, were progressively replaced by marrow-derived T cells in both pa- 
tients. These results indicate successful gene transfer into long-lasting progenitor cells, 
producing a functional multilineage progeny. 



Severe combined immunodeficiency associated
with inherited deficiency of ADA
(1) is usually fatal unless affected children
are kept in protective isolation or the immune
system is reconstituted by bone marrow
transplantation from a human leukocyte
antigen (HLA)-identical sibling donor
(2). This is the therapy of choice, although
it is available only for a minority of patients.
In recent years, other forms of therapy have
been developed, including transplants from
haploidentical donors (3, 4), exogenous enzyme
replacement (5), and somatic-cell
gene therapy (6-9).

We previously reported a preclinical model
in which ADA gene transfer and expression
successfully restored immune functions in human
ADA-deficient (ADA⁻) peripheral
blood lymphocytes (PBLs) in immunodeficient
mice in vivo (10, 11). On the basis of
these preclinical results, the clinical application
of gene therapy for the treatment of
ADA⁻ SCID (severe combined immunodeficiency
disease) patients who previously failed
exogenous enzyme replacement therapy was
approved by our Institutional Ethical Committees
and by the Italian National Committee
for Bioethics (12). In addition to evaluating
the safety and efficacy of the gene therapy
procedure, the aim of the study was to define
the relative role of PBLs and hematopoietic
stem cells in the long-term reconstitution of
immune functions after retroviral vector-mediated
ADA gene transfer. For this purpose,
two structurally identical vectors expressing
the human ADA complementary DNA
(cDNA), distinguishable by the presence of
alternative restriction sites in a nonfunctional
region of the viral long-terminal repeat
(LTR), were used to transduce PBLs and bone
marrow (BM) cells independently. This procedure
allowed identification of the origin of
Differential gene expression in drug metabolism and
toxicology: practicalities, problems and potential 

JOHN C. ROCKETT†, DAVID J. ESDAILE‡
Molecular Toxicology Laboratory, School of Biological Sciences, University of Surrey,
Guildford, Surrey, GU2 5XH, UK 

Received January 8, 1999 

1. An important feature of the work of many molecular biologists is identifying which
genes are switched on and off in a cell under different environmental conditions or
subsequent to xenobiotic challenge. Such information has many uses, including the
deciphering of molecular pathways and facilitating the development of new experimental
and diagnostic procedures. However, the student of gene hunting should be forgiven for
perhaps becoming confused by the mountain of information available as there appear to be
almost as many methods of discovering differentially expressed genes as there are research
groups using the technique.

2. The aim of this review was to clarify the main methods of differential gene expression 
analysis and the mechanistic principles underlying them. Also included is a discussion on 
some of the practical aspects of using this technique. Emphasis is placed on the so-called 
'open' systems, which require no prior knowledge of the genes contained within the study
model. Whilst these will eventually be replaced by 'closed' systems in the study of human,
mouse and other commonly studied laboratory animals, they will remain a powerful tool for 
those examining less fashionable models. 

3. The use of suppression-PCR subtractive hybridization is exemplified in the
identification of up- and down-regulated genes in rat liver following exposure to
phenobarbital, a well-known inducer of the drug metabolizing enzymes.

4. Differential gene display provides a coherent platform for building libraries and 
microchip arrays of 'gene fingerprints' characteristic of known enzyme inducers and 
xenobiotic toxicants, which may be interrogated subsequently for the identification and 
characterization of xenobiotics of unknown biological properties. 



Introduction 

It is now apparent that the development of almost all cancers and many
non-neoplastic diseases is accompanied by altered gene expression in the affected cells
compared to their normal state (Hunter 1991, Wynford-Thomas 1991, Vogelstein
and Kinzler 1993, Semenza 1994, Cassidy 1995, Kleinjan and Van Heyningen 1998).
Such changes also occur in response to external stimuli such as pathogenic
micro-organisms (Rohn et al. 1996, Singh et al. 1997, Griffin and Krishna 1998, Lunney
1998) and xenobiotics (Sewall et al. 1995, Dogra et al. 1998, Ramana and Kohli
1998), as well as during the development of undifferentiated cells (Hecht 1998,
Rudin and Thompson 1998, Schneider-Maunoury et al. 1998). The potential
medical and therapeutic benefits of understanding the molecular changes which
occur in any given cell in progressing from the normal to the 'altered' state are
enormous. Such profiling essentially provides a 'fingerprint' of each step of a
cell's development or response and should help in the elucidation of specific and
sensitive biomarkers representing, for example, different types of cancer or previous
exposure to certain classes of chemicals that are enzyme inducers.

In drug metabolism, many of the xenobiotic-metabolizing enzymes (including 
the well-characterized isoforms of cytochrome P450) are inducible by drugs and 
chemicals in man (Pelkonen et al. 1998), predominantly involving transcriptional 
activation of not only the cognate cytochrome P450 genes, but additional cellular 
proteins which may be crucial to the phenomenon of induction. Accordingly, the 
development of methodology to identify and assess the full complement of genes
that are either up- or down-regulated by inducers is crucial in the development of
knowledge to understand the precise molecular mechanisms of enzyme induction
and how this relates to drug action. Similarly, in the field of chemical-induced
toxicity, it is now becoming increasingly obvious that most adverse reactions to 
drugs and chemicals are the result of multiple gene regulation, some of which are 
causal and some of which are casually-related to the toxicological phenomenon per 
se. This observation has led to an upsurge in interest in gene-profiling technologies 
which differentiate between the control and toxin-treated gene pools in target tissues 
and is, therefore, of value in rationalizing the molecular mechanisms of xenobiotic- 
induced toxicity. Knowledge of toxin-dependent gene regulation in target tissues is 
not solely an academic pursuit as much interest has been generated in the 
pharmaceutical industry to harness this technology in the early identification of toxic 
drug candidates, thereby shortening the developmental process and contributing 
substantially to the safety assessment of new drugs. For example, if the gene profile 
in response to say a testicular toxin that has been well-characterized in vivo could be 
determined in the testis, then this profile would be representative of all new drug 
candidates which act via this specific molecular mechanism of toxicity, thereby 
providing a useful and coherent approach to the early detection of such toxicants. 
Whereas it would be informative to know the identity and functionality of all genes 
up/down regulated by such toxicants, this would appear a longer term goal, as the 
majority of human genes have not yet been sequenced, far less their functionality 
determined. However, the current use of gene profiling yields a pattern of gene 
changes for a xenobiotic of unknown toxicity which may be matched to that of well- 
characterized toxins, thus alerting the toxicologist to possible in vivo similarities 
between the unknown and the standard, thereby providing a platform for more 
extensive toxicological examination. Such approaches are beginning to gain 
momentum, in that several biotechnology companies are commercially producing 
'gene chips' or 'gene arrays' that may be interrogated for toxicity assessment of 
xenobiotics. These chips consist of hundreds/thousands of genes, some of which are 
degenerate in the sense that not all of the genes are mechanistically-related to any 
one toxicological phenomenon. Whereas these chips are useful in broad-spectrum 
screening, they are maturing at a substantial rate, in that gene arrays are now 
becoming more specific, e.g. chips for the identification of changes in growth factor 
families that contribute to the aetiology and development of chemically-induced 
neoplasias. 
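
The matching of an unknown compound's gene profile against those of well-characterized toxins can be illustrated with a small similarity search. This is a minimal sketch, assuming each profile is a vector of log2 fold changes over the same ordered gene set and that Pearson correlation is an acceptable similarity measure; the profile names and all numbers are hypothetical and are not data from this review.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length expression profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def best_match(unknown, reference_profiles):
    """Return the name of the most similar reference profile and all scores."""
    scores = {name: pearson(unknown, profile)
              for name, profile in reference_profiles.items()}
    return max(scores, key=scores.get), scores


# Hypothetical log2 fold changes for five genes (illustrative values only).
references = {
    "enzyme_inducer": [2.0, 1.5, 0.1, -0.2, 0.0],
    "testicular_toxin": [-0.1, 0.2, 1.8, -1.5, 0.9],
}
unknown_compound = [0.0, 0.3, 1.6, -1.2, 1.0]

match, scores = best_match(unknown_compound, references)
print(match, {name: round(score, 3) for name, score in scores.items()})
```

A high correlation with a reference profile would only flag the unknown compound for more extensive toxicological examination; it would not by itself establish a shared mechanism.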

Although documenting and explaining these genetic changes presents a 
formidable obstacle to understanding the different mechanisms of development and 
disease progression, the technology is now available to begin attempting this difficult 
challenge. Indeed, several 'differential expression analysis' methods have been
developed which facilitate the identification of gene products that demonstrate
altered expression in cells of one population compared to another. These methods
have been used to identify differential gene expression in many situations, including 
invading pathogenic microbes (Zhao et al. 1998), in cells responding to extracellular 
and intracellular microbial invasion (Duguid and Dinauer 1990, Ragno et al. 1997, 
Maldarelli et al. 1998), in chemically treated cells (Syed et al. 1997, Rockett et al. 
1999), neoplastic cells (Liang et al. 1992, Chang and Terzaghi-Howe 1998), 
activated cells (Gurskaya et al. 1996, Wan et al. 1996), differentiated cells (Hara et 
al. 1991, Guimaraes et al. 1995a, b), and different cell types (Davis et al. 1984, 
Hedrick et al. 1984, Xhu et al. 1998). Although differential expression analysis 
technologies are applicable to a broad range of models, perhaps their most important 
advantage is that, in most cases, absolutely no prior knowledge of the specific genes 
which are up- or down-regulated is required. 

The field of differential expression analysis is a large and complex one, with 
many techniques available to the potential user. These can be categorized into 
several methodological approaches, including: 

(1) Differential screening, 

(2) Subtractive hybridization (SH) (includes methods such as chemical cross- 
linking subtraction — CCLS, suppression-PCR subtractive hybridization — 
SSH, and representational difference analysis — RDA), 

(3) Differential display (DD), 

(4) Restriction endonuclease facilitated analysis (including serial analysis of gene 
expression — SAGE — and gene expression fingerprinting — GEF), 

(5) Gene expression arrays, and 

(6) Expressed sequence tag (EST) analysis. 

The above approaches have been used successfully to isolate differentially 
expressed genes in different model systems. However, each method has its own 
subtle (and sometimes not so subtle) characteristics which incur various advantages 
and disadvantages. Accordingly, it is the purpose of this review to clarify the 
mechanistic principles underlying the main differential expression methods and to 
highlight some of the broader considerations and implications of this very powerful 
and increasingly popular technique. Specifically, we will concentrate on the so- 
called 'open' systems, namely those which do not require any knowledge of gene 
sequences and, therefore, are useful for isolating unknown genes. Two 'closed'
systems (those utilising previously identified gene sequences), EST analysis and the 
use of DNA arrays, will also be considered briefly for completeness. Whilst 
emphasis will often be placed on suppression PCR subtractive hybridization (SSH, 
the approach employed in this laboratory), it is the aim of the authors to highlight, 
wherever possible, those areas of common interest to those who use, or intend to use, 
differential gene expression analysis. 

Differential cDNA library screening (DS) 

Despite the development of multiple technological advances which have recently 
brought the field of gene expression profiling to the forefront of molecular analysis, 
recognition of the importance of differential gene expression and characterization of 
differentially expressed genes has existed for many years. One of the original 
approaches used to identify such genes was described 20 years ago by St John and 
Davis (1979). These authors developed a method, termed 'differential plaque filter
hybridization', which was used to isolate galactose-inducible DNA sequences from
yeast. The theory is simple: a genomic DNA library is prepared from normal, 
unstimulated cells of the test organism/tissue and multiple filter replicas are 
prepared. These replica blots are probed with radioactively (or otherwise) labelled 
complex cDNA probes prepared from the control and test cell mRNA populations. 
Those mRNAs which are differentially expressed in the treated cell population will 
show a positive signal only on the filter probed with cDNA from the treated cells. 
Furthermore, labelled cDNA from different test conditions can be used to probe 
multiple blots, thereby enabling the identification of mRNAs which are only up- 
regulated under certain conditions. For example, St John and Davis (1979) screened 
replica filters with acetate-, glucose- and galactose-derived probes in order to obtain
genes induced specifically by galactose metabolism. Although groundbreaking in its
time, this method is now considered insensitive and time-consuming, as up to 2
months are required to complete the identification of genes which are differentially 
expressed in the test population. In addition, there is no convenient way to check 
that the procedure has worked until the whole process has been completed. 
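
The logic of scoring replica filters can be expressed as a set difference: a clone is of interest when it gives a signal with the test probe but with none of the control probes. The sketch below assumes each filter has already been reduced to a set of positive clone identifiers; the clone names and the signal table are hypothetical.

```python
def differentially_positive(signals, test_probe, control_probes):
    """Clones positive with the test probe and negative with every control probe."""
    positives = set(signals[test_probe])
    for probe in control_probes:
        positives -= set(signals[probe])
    return positives


# Hypothetical positive clones on three replica filters.
signals = {
    "galactose": {"clone_03", "clone_07", "clone_11", "clone_19"},
    "glucose": {"clone_03", "clone_19"},
    "acetate": {"clone_03", "clone_11"},
}

galactose_specific = differentially_positive(
    signals, "galactose", ["glucose", "acetate"]
)
print(sorted(galactose_specific))
```

In this toy data set only one clone is positive exclusively on the galactose filter, which mirrors the use of acetate- and glucose-derived probes as controls.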

Subtractive Hybridization (SH) 

The developing concept of differential gene expression and the success of early 
approaches such as that described by St John and Davis (1979) soon gave rise to a 
search for more convenient methods of analysis. One of the first to be developed was 
SH, numerous variations of which have since been reported (see below). In general, 
this approach involves hybridization of mRNA/cDNA from one population (tester)
to excess mRNA/cDNA from another (driver), followed by separation of the 
unhybridized tester fraction (differentially expressed) from the hybridized common 
sequences. This step has been achieved physically, chemically and through the use 
of selective polymerase chain reaction (PCR) techniques. 

Physical separation 

Original subtractive hybridization technology involved the physical separation 
of hybridized common species from unique single stranded species. Several methods 
of achieving this have been described, including hydroxyapatite chromatography 
(Sargent and Dawid 1983), avidin-biotin technology (Duguid and Dinauer 1990) 
and oligodT-latex separation (Hara et al. 1991). In the first approach, common 
mRNA species are removed by cDNA (from test cells)-mRNA (from control cells) 
subtractive hybridization followed by hydroxyapatite chromatography, as hydroxy- 
apatite specifically adsorbs the cDNA-mRNA hybrids. The unabsorbed cDNA is 
then used either for the construction of a cDNA library of differentially expressed 
genes (Sargent and Dawid 1983, Schneider et al. 1988) or directly as a probe to 
screen a preselected library (Zimmerman et al. 1980, Davis et al. 1984, Hedrick et al. 
1984). A schematic diagram of the procedure is shown in figure 1. 

Less rigorous physical separation procedures coupled with sensitivity enhancing 
PCR steps were later developed as a means to overcome some of the problems 
encountered with the hydroxyapatite procedure. For example, Duguid and Dinauer
(1990) described a method of subtraction utilizing biotin-affinity systems as a means
to remove hybridized common sequences. In this process, both the control and
tester mRNA populations are first converted to cDNA and an adaptor ('oligovector',
Figure 1. The hydroxyapatite method of subtractive hybridization. cDNA derived from the
treated/altered (tester) population is mixed with a large excess of mRNA from the control (driver)
population. Following hybridization, mRNA-cDNA hybrids are removed by hydroxyapatite
chromatography. The only cDNAs which remain are those which are differentially expressed in
the treated/altered population. In order to facilitate the recovery of full length clones, small cDNA
fragments are removed by exclusion chromatography. The remaining cDNAs are then cloned into
a vector for sequencing, or labelled and used directly to probe a library, as described by Sargent
and Dawid (1983).

containing a restriction site) ligated to both sides. Both populations are then 
amplified by PCR, but the driver cDNA population is subsequently digested with 
the adaptor-containing restriction endonuclease. This serves to cleave the oligo- 
vector and reduce the amplification potential of the control population. The digested 
control population is then biotinylated and an excess mixed with tester cDNA. 
Following denaturation and hybridization, the mix is applied to a biocytin column 
(streptavidin may also be used) to remove the control population, including 
heteroduplexes formed by annealing of common sequences from the tester 
population. The procedure is repeated several times following the addition of fresh 






[Figure 2 schematic. Steps shown: control (driver) mRNA is annealed to polydT30 latex beads and converted to cDNA; test (tester) mRNA is mixed and annealed with the bead-bound driver cDNA; the beads are centrifuged, the supernatant is collected and stored, polyA is dissociated and the supernatant is reapplied; tester-specific mRNA is retrieved after 4 rounds of hybridization; cDNA synthesis; adaptors are ligated and the products inserted into a vector; inserts are sequenced and/or used in other downstream applications.]

Figure 2. The use of oligodT30 latex to perform subtractive hybridization. mRNA extracted from the
control (driver) population is converted to anchored cDNA using polydT oligonucleotides 
attached to latex beads. mRNA from the treated/altered (tester) population is repeatedly 
hybridized against an excess of the anchored driver cDNA. The final population of mRNA is 
tester specific and can be converted into cDNA for cloning and other downstream applications, as 
described by Hara et al. (1991). 
control cDNA. In order to further enrich those species differentially expressed in
the tester cDNA, the subtracted tester population is amplified by PCR following 
every second subtraction cycle. After six cycles of subtraction (three reamplification 
steps) the reaction mix is ligated into a vector for further analysis. 

In a slightly different approach, Hara et al. (1991) utilized a method whereby 
oligo(dT30) primers attached to a latex substrate are used to first capture mRNA
extracted from the control population. Following 1st strand cDNA synthesis, the
RNA strand of the heteroduplexes is removed by heat denaturation and centrifugation
(the cDNA-oligotex-dT30 forms a pellet and the supernatant is removed).
A quantity of tester mRNA is then repeatedly hybridized to the immobilized control
(driver) cDNA (which is present in 20-fold excess). After several rounds of
hybridization the only mRNA molecules left in the tester mRNA population are
those which are not found in the driver cDNA-oligotex-dT30 population. These
tester-specific mRNA species are then converted to cDNA and, following the 
addition of adaptor sequences, amplified by PCR. The PCR products are then 
ligated into a vector for further analysis using restriction sites incorporated into the 
PCR primers. A schematic illustration of this subtraction process is shown in figure 
2. 
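
The effect of repeating the hybridization against immobilized driver cDNA can be sketched numerically. This is a simplified model, assuming that each round removes a fixed fraction of the common mRNA from the supernatant and leaves tester-specific mRNA untouched; the 90% capture efficiency and the starting amounts are hypothetical, while the four rounds follow Figure 2.

```python
def enrich(common, specific, capture_efficiency, rounds):
    """Tester-specific fraction of the supernatant after each hybridization round."""
    fractions = []
    for _ in range(rounds):
        # Common species hybridize to the bead-bound driver cDNA and are pelleted.
        common *= 1.0 - capture_efficiency
        fractions.append(specific / (specific + common))
    return fractions


# Hypothetical start: 1% of the tester mRNA is tester-specific.
purity = enrich(common=990.0, specific=10.0, capture_efficiency=0.9, rounds=4)
for round_number, fraction in enumerate(purity, start=1):
    print(f"round {round_number}: {fraction:.4f} tester-specific")
```

Under these assumptions a single round leaves the supernatant dominated by common species, which is consistent with the need for several rounds noted in the text.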

However, all these methods utilising physical separation have been described as 
inefficient due to the requirement for large starting amounts of mRNA, significant 
loss of material during the separation process and a need for several rounds of 
hybridization. Hence, new methods of differential expression analysis have recently 
been designed to eliminate these problems. 

Chemical Cross-Linking Subtraction (CCLS) 

In this technique, originally described by Hampson et al. (1992), driver mRNA 
is mixed with tester cDNA (1st strand only) in a ratio of >20:1. The common
sequences form cDNA:mRNA hybrids, leaving the tester specific species as single
stranded cDNA. Instead of physically separating these hybrids, they are inactivated
chemically using 2,5-diaziridinyl-1,4-benzoquinone (DZQ). Labelled probes are
then synthesized from the remaining single stranded cDNA species (unreacted
mRNA species remaining from the driver are not converted into probe material due
to specificity of Sequenase T7 DNA polymerase used to make the probe) and used
to screen a cDNA library made from the tester cell population. A schematic diagram
of the system is shown in figure 3. 

It has been shown that the differentially expressed sequences can be enriched at 
least 300-fold with one round of subtraction (Hampson et al. 1992), and that the 
technique should allow isolation of cDNAs derived from transcripts that are present 
at less than 50 copies per cell. This equates to genes at the low end of intermediate 
abundance (see table 1). The main advantages of the CCLS approach are that it is 
rapid, technically simple and also produces fewer false positives than other 
differential expression analysis methods. However, like the physical separation 
protocols, a major drawback with CCLS is the large amount of starting material 
required (at least 10 µg RNA). Consequently, the technique has recently been
refined so that a renewable source of RNA can be generated. The degenerate random 
oligonucleotide primed (DROP) adaptation (Hampson et al. 1996, Hampson and 
Hampson 1997) uses random hexanucleotide sequences to prime solid phase-synthesized
cDNA. Since each primer includes a T7 polymerase promoter sequence
Figure 3. Chemical cross-linking subtraction. Excess driver mRNA is mixed with 1st strand tester
cDNA. The common sequences form mRNA:cDNA hybrids which are cross linked with
2,5-diaziridinyl-1,4-benzoquinone (DZQ) and the remaining cDNA sequences are differentially
expressed in the tester population. Probes are made from these sequences using Sequenase 2.0
DNA polymerase, which lacks reverse transcriptase activity and, therefore, does not react with the
remaining mRNA molecules from the driver. The labelled probes are then used to screen a cDNA
library for clones of differentially expressed sequences. Adapted from Walters et al. (1996), with
permission.



Table 1. The abundance of mRNA species and classes in a typical mammalian cell. 
Modified from Bertioli et al. (1995).

at the 5' end, the final pool of random cDNA fragments is a PCR-renewable cDNA
population which is representative of the expressed gene pool and can be used to 
synthesize sense RNA for use as driver material. Furthermore, if the final pool of 
random cDNA fragments is reamplified using biotinylated T7 primer and random 
hexamer, the product can be captured with streptavidin beads and the antisense 
strand eluted for use as tester. Since both target and driver can be generated from 
the same DROP product, subtraction can be performed in both directions (i.e. for 
up- and down-regulated species) between two different DROP products. 

Representational Difference Analysis (RDA)

RDA of cDNA (Hubank and Schatz 1994) is an extension of the technique 
originally applied to genomic DNA as a means of identifying differences between 
two complex genomes (Lisitsyn et al. 1993). It is a process of subtraction and 
amplification involving subtractive hybridization of the tester in the presence of 
excess driver. Sequences in the tester that have homologues in the driver are 
rendered unamplifiable, whereas those genes expressed only in the tester retain the 
ability to be amplified by PCR. The procedure is shown schematically in figure 4. 

In essence, the driver and tester mRNA populations are first converted to cDNA
and amplified by PCR following the ligation of an adaptor. The adaptors are then 
removed from both populations and a new (different) adaptor ligated to the 
amplified tester population only. Driver and tester populations are next melted and 
hybridized together in a ratio of 100:1. Following hybridization, only tester:tester
homohybrids have 5' adaptors at each end of the DNA duplex and can, thus, be filled
in at both 3' ends. Hence, only these molecules are amplified exponentially during
the subsequent PCR step. Although tester:driver heterohybrids are present, they
only amplify in a linear fashion, since the strand derived from the driver has no
adaptor to which the primer can bind. Driver:driver homohybrids have no
adaptors and, therefore, are not amplified. Single stranded molecules are digested
with mung bean nuclease before a further PCR-enrichment of the tester:tester
homohybrids. The adaptors on the amplified tester population are then replaced and
the whole process repeated a further two or three times using an increasing excess of
driver (Hubank and Schatz used a tester:driver ratio of 1:400, 1:80000 and
1:800000 for the second, third and fourth hybridizations, respectively). Different
adaptors are ligated to the tester between successive rounds of hybridization and 
amplification to prevent the accumulation of PCR products that might interfere with 
subsequent amplifications. The final display is a series of differentially expressed 
gene products easily observable on an ethidium bromide gel. 
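
The selectivity of the PCR step in RDA follows from how each class of hybrid is copied. The sketch below is an idealized model, assuming perfect efficiency, a single starting duplex of each class, exact doubling per cycle for molecules primable at both ends and one extra copy per cycle for molecules primable at one end only; the function name and the choice of 20 cycles are illustrative.

```python
def copies_after_pcr(hybrid: str, cycles: int) -> int:
    """Idealized copy number after PCR, starting from one duplex of the given class."""
    if hybrid == "tester:tester":
        return 2 ** cycles      # adaptors on both strands: exponential amplification
    if hybrid == "tester:driver":
        return 1 + cycles       # adaptor on one strand only: linear amplification
    if hybrid == "driver:driver":
        return 1                # no adaptors: not amplified
    raise ValueError(f"unknown hybrid class: {hybrid}")


for hybrid_class in ("tester:tester", "tester:driver", "driver:driver"):
    print(hybrid_class, copies_after_pcr(hybrid_class, cycles=20))
```

Even in this simplified picture, 20 cycles separate the tester:tester products from the heterohybrids by several orders of magnitude, which is why the tester-specific sequences dominate the final display.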

The main advantages of RDA are that it offers a reproducible and sensitive 
approach to the analysis of differentially expressed genes. Hubank and Schatz (1994) 
reported that they were able to isolate genes that were differentially expressed in 
substantially less than 1 % of the cells from which the tester is derived. Perhaps the 
main drawback is that multiple rounds of ligation, hybridization, amplification and
digestion are required. The procedure is, therefore, lengthier than many other 
differential display approaches and provides more opportunity for operator-induced 
error to occur. Although the generation of false positives has been noted, this has 
been solved to some degree by O'Neill and Sinclair (1997) through the use of HPLC- 
purified adaptors. These are free of the truncated adaptors which appear to be a 
major source of the false positive bands. A very similar technique to RDA, termed 
linker capture subtraction (LCS) was described by Yang and Sytowski (1996). 
Figure 4. The representational difference analysis (RDA) technique. Driver and tester cDNA are
digested with a 4-cutter restriction enzyme such as DpnII. The 1st set of 12/24 adaptor strands
(oligonucleotides) are ligated to each other and the digested cDNA products. The 12mer is
subsequently melted away and the 3' ends filled in using Taq DNA polymerase. Each cDNA
population is then amplified using PCR, following which the 1st set of adaptors is removed with
DpnII. A second set of 12/24 adaptor strands is then added to the amplified tester cDNA
population, after which the tester is hybridized against a large excess of driver. The 12mer
adaptors are melted and the 3' ends filled in as before. PCR is carried out with primers identical
to the new 24mer adaptor. Thus, the only hybridization products which are exponentially
amplified are those which are tester:tester combinations. Following PCR, ssDNA products are
removed with mung bean nuclease, leaving the 'first difference product'. This is digested and a
third set of 12/24 adaptors added before repeating the subtraction process from the hybridization
stage. The process is repeated to the 3rd or 4th difference product, as described by Lisitsyn et al.
(1993) and Hubank and Schatz (1994).
Suppression PCR Subtractive Hybridization (SSH)

The most recent adaptation of the SH approach to differential expression 
analysis was first described by Diatchenko et al. (1996) and Gurskaya et al. (1996). 
They reported that a 1000-5000 fold enrichment of rare cDNAs (equivalent to 
isolating mRNAs present at only a few copies per cell) can be obtained without the 
need for multiple hybridizations/subtractions. Instead of physical or chemical 
removal of the common sequences, a PCR-based suppression system is used (see 
figure 5). 

In SSH, excess driver cDNA is added to two portions of the tester cDNA which 
have been ligated with different adaptors. A first round of hybridization serves to 
enrich differentially expressed genes and equalize rare and abundant messages. 
Equalization occurs since reannealing is more rapid for abundant molecules than for 
rarer molecules due to the second order kinetics of hybridization (James and Higgins 
1985). The two primary hybridization mixes are then mixed together in the presence 
of excess driver and allowed to hybridize further. This step permits the annealing of 
single stranded complementary sequences which did not hybridize in the primary 
hybridization, and in doing so generates templates for PCR amplification. Although 
there are several possible combinations of the single stranded molecules present in 
the secondary hybridization mix, only one particular combination (sequences 
differentially expressed in the tester cDNA, composed of complementary strands 
bearing different adaptors) can amplify exponentially. 
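
The equalization step can be illustrated with a short numerical sketch. It assumes ideal second-order reannealing, in which the single-stranded concentration of a species falls as C(t) = C0 / (1 + k * C0 * t). The rate constant, hybridization time and starting concentrations are arbitrary, hypothetical values chosen for illustration and are not taken from the cited papers.

```python
# Sketch of equalization under ideal second-order reannealing kinetics.
# Assumption: C(t) = C0 / (1 + k * C0 * t); k, t and the starting
# concentrations are arbitrary illustrative values, not measured data.

def single_stranded_remaining(c0, k, t):
    """Concentration of a species still single-stranded after time t."""
    return c0 / (1.0 + k * c0 * t)

# Hypothetical relative starting concentrations of three tester cDNAs.
initial = {"abundant": 1000.0, "intermediate": 10.0, "rare": 0.1}
k = 1.0  # arbitrary rate constant
t = 1.0  # arbitrary hybridization time

remaining = {name: single_stranded_remaining(c0, k, t)
             for name, c0 in initial.items()}

# Compare the abundant:rare ratio before and after hybridization.
initial_ratio = initial["abundant"] / initial["rare"]
final_ratio = remaining["abundant"] / remaining["rare"]

for name in initial:
    print(f"{name:>12}: start {initial[name]:8.1f}  "
          f"single-stranded {remaining[name]:.3f}")
print(f"abundant/rare ratio: {initial_ratio:.0f} before, {final_ratio:.1f} after")
```

Under these assumptions the abundant species reanneals almost completely while the rare species remains largely single stranded, so the single-stranded pool is far more even than the starting population.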

Having obtained the final differential display, two options are available if cloning 
of cDNAs is desired. One is to transform the whole of the final PCR reaction into 
competent cells. Transformed colonies can then be isolated and their inserts 
characterized by sequencing, restriction analysis or PCR. Alternatively, the final 
PCR products can be resolved on a gel and the individual bands excised, reamplified 
and cloned. The first approach is technically simpler and less time consuming. 
However, ligation/transformation reactions are known to be biased towards the 
cloning of smaller molecules, and so the final population of clones will probably not 
contain a representative selection of the larger products. In addition, although 
equalization theoretically occurs, observations in this laboratory suggest that this is 
by no means perfectly accomplished. Consequently, some gene species are present 
in a higher number than others and this will be represented in the final population 
of clones. Thus, in order to obtain a substantial proportion of those gene species that 
actually demonstrate differential expression in the tester population, the number of 
clones that will have to be screened after this step may be substantial. The second 
approach is initially more time consuming and technically demanding. However, it 
would appear to offer better prospects for cloning larger and low abundance gel 
products. In addition, one can incorporate a screening step that distinguishes 
products of the same size but different sequence (HA-staining, see 
later). In this way, a good idea of the final number of clones to be isolated and 
identified can be obtained. 

An alternative (or even complementary) approach is to use the final differential 
display reaction to screen a cDNA library to isolate full length clones for further 
characterization, or a DNA array (see later) to quickly identify known genes. SSH 
has been used in this laboratory to begin characterization of the short-term gene 
expression profiles of enzyme-inducers such as phenobarbital (Rockett et al. 1997) 
and Wy-14,643 (Rockett et al. unpublished observations). The isolation of 
differentially expressed genes in this manner enables the construction of a fingerprint 

Figure 5. PCR-Select cDNA subtraction. In the primary hybridization, an excess of driver cDNA is 
added to each tester cDNA population. The samples are heat denatured and allowed to hybridize 
for between 3 and 8 h. This serves two purposes: (1) to equalize rare and abundant molecules; and 
(2) to enrich for differentially expressed sequences (cDNAs that are not differentially expressed 
form type c molecules with the driver). In the secondary hybridization, the two primary 
hybridizations are mixed together without denaturing. Fresh denatured driver can also be added 
at this point to allow further enrichment of differentially expressed sequences. Type e molecules 
are formed in this secondary hybridization which are subsequently amplified using two rounds of 
PCR. The final products can be visualized on an agarose gel, labelled directly or cloned into a 
vector for downstream manipulation. As described by Diatchenko et al. (1996) and Gurskaya 
et al. (1996), with permission. 

Figure 6. Flow diagram showing method used in this laboratory to isolate and identify clones of genes 
which are differentially expressed in rat liver following short term exposure to the enzyme 
inducers, phenobarbital and Wy-14,643. 



of expressed genes which are unique to each compound and time/dose point. Such 
information could be useful in short-term characterization of the toxic potential of 
new compounds by comparing the gene-expression profiles they elicit with those 
produced by known inducers. Figure 6 shows a flow diagram of the method used to 
isolate, verify and clone differentially expressed genes, and figure 7 shows expression 
profiles obtained from a typical SSH experiment. Subsequent sub-cloning of the 
individual bands, sequencing and gene database interrogation reveal many genes 
which are either up- or down-regulated by phenobarbital in the rat (tables 2 and 3). 

One of the advantages in using the SSH approach is that no prior knowledge is 
required of which specific genes are up/down-regulated subsequent to xenobiotic 

Figure 7. SSH display patterns obtained from rat liver following 3-day treatment with Wy-14,643 or 
phenobarbital. mRNA extracted from control and treated livers was used to generate the 
differential displays using the PCR-Select cDNA subtraction kit (Clontech). Lanes: 1, 1 kb 
ladder; 2, genes upregulated following Wy-14,643 treatment; 3, genes downregulated following 
Wy-14,643 treatment; 4, genes upregulated following phenobarbital treatment; 5, genes 
downregulated following phenobarbital treatment; 6, 1 kb ladder. Reproduced from Rockett et 
al. (1997), with permission. 

exposure, and an almost complete complement of genes is obtained. For example, 
the peroxisome proliferator and non-genotoxic hepatocarcinogen Wy-14,643 up-regulates 
at least 28 genes and down-regulates at least 15 in the rat (a sensitive 
species) and produces 48 up- and 37 down-regulated genes in the guinea pig, a 
resistant species (Rockett, Swales, Esda and Gibson, unpublished observations). 
One of these genes, CD81, was up-regulated in the rat and down-regulated in the 
guinea pig following Wy-14,643 treatment. CD81 (alternatively named TAPA-1) is 
a widely expressed cell surface protein which is involved in a large number of cellular 
processes including adhesion, activation, proliferation and differentiation (Levy et 
al. 1998). Since all of these functions are altered to some extent in the phenomena 
of hepatomegaly and non-genotoxic hepatocarcinogenesis, it is intriguing, and 
probably mechanistically-relevant, that CD81 expression is differentially regulated 
in a resistant and susceptible species. However, the down-side of this approach is 
that, although the majority of genes can be sequenced and matched to database 
sequences, the latter are predominantly expressed sequence tags or genes of completely 
unknown function, thus partially obscuring a realistic overall assessment of the 
critical genes of genuine biological interest. Notwithstanding the lack of complete 
functional identification of altered gene expression, such gene profiling studies 
essentially provide a 'molecular fingerprint' in response to xenobiotic challenge, 
thereby serving as a mechanistically-relevant platform for further detailed 
investigations. 

Differential Display (DD) 

Originally described as 'RNA fingerprinting by arbitrarily primed PCR' (Liang 
and Pardee 1992), this method is now more commonly referred to as 'differential 

Table 2. Genes up-regulated in rat liver following 3-day exposure to phenobarbital. 

Band number 
(approximate    Highest sequence 
size in bp)     similarity          FASTA-EMBL gene identification 

5 (1300)        93.5%               CYP2B1 
7 (1000)        95.1%               Preproalbumin 
                                    Serum albumin mRNA 
8 (950)         98.3%               NCI-CGAP-Pr1 H. sapiens (EST) 
10 (850)        95.7%               CYP2B1 
11 (800)        Clone 1  94.9%      CYP2B1 
                Clone 2  75.3%      CYP2B2 
12 (750)        93.8%               TRPM-2 mRNA 
                                    Sulfated glycoprotein 
15 (600)        92.9%               Preproalbumin 
                                    Serum albumin mRNA 
16 (55)         Clone 1  95.2%      CYP2B1 
                Clone 2  93.6%      Haptoglobulin mRNA partial alpha 
21 (350)        99.3%               18S, 5.8S & 28S rRNA 

Bands 1-4, 6, 9, 13, 14, and 17-20 were shown to be false positives by dot blot analysis and, therefore, 
were not sequenced. Derived from Rockett et al. (1997). It should be noted that the above genes do not 
represent the complete spectrum of genes which are up-regulated in rat liver by phenobarbital, but 
simply represent the genes sequenced and identified to date. 



Table 3. Genes down-regulated in rat liver following 3-day exposure to phenobarbital. 

Band number 
(approximate    Highest sequence 
size in bp)     similarity          FASTA-EMBL gene identification 

1 (1500)        95.3%               3-oxoacyl-CoA thiolase 
2 (1200)        92.3%               Hemopexin mRNA 
3 (1000)        91.7%               Alpha-2u-globulin mRNA 
7 (700)         Clone 1  77.2%      M. musculus C1 inhibitor 
                Clone 2  94.5%      Electron transfer flavoprotein 
                Clone 3  91.0%      M. musculus Topoisomerase 1 (Topo 1) 
8 (650)         Clone 1  86.9%      Soares 2NbMT M. musculus (EST) 
                Clone 2  96.2%      Alpha-2u-globulin (s-type) mRNA 
9 (600)         Clone 1  86.9%      Soares mouse NML M. musculus (EST) 
                Clone 2  82.0%      Soares p3NMF 19.5 M. musculus (EST) 
10 (550)        73.8%               Soares mouse NML M. musculus (EST) 
11 (525)        95.7%               NCI-CGAP-Pr1 H. sapiens (EST) 
12 (375)        100.0%              Ribosomal protein 
13 (23)         Clone 1  97.2%      Soares mouse embryo NbME135 (EST) 
                Clone 2  100.0%     Fibrinogen B-beta-chain 
                Clone 3  100.0%     Apolipoprotein E gene 
14 (170)        96.0%               Soares p3NMF19.5 M. musculus (EST) 
15 (140)        97.3%               Stratagene mouse testis (EST) 
Others: (300)   96.7%               R. norvegicus RASP 1 mRNA 
        (275)   93.1%               Soares mouse mammary gland (EST) 

EST = Expressed sequence tag. Bands 4-6 were shown to be false positives by dot blot analysis and, 
therefore, were not sequenced. Derived from Rockett et al. (1997). It should be noted that the above genes 
do not represent the complete spectrum of genes which are down-regulated in rat liver by phenobarbital, 
but simply represent the genes sequenced and identified to date. 



display' (DD). In this method, all the mRNA species in the control and treated cell 
populations are amplified in separate reactions using reverse transcriptase-PCR 
(RT-PCR). The products are then run side-by-side on sequencing gels. Those 
bands which are present in one display only, or which are much more intense in one 
display compared to the other, are differentially expressed and may be recovered for 
further characterization. One advantage of this system is the speed with which it can 
be carried out — 2 days to obtain a display and as little as a week to make and identify 
clones. 

Two commonly used variations are based on different methods of priming the 
reverse transcription step (figure 8). One is to use an oligo dT with a 2-base 'anchor' 
at the 3'-end, e.g. 5'-(dT)nCA-3' (Liang and Pardee 1992). Alternatively, an 
arbitrary primer may be used for 1st strand cDNA synthesis (Welsh et al. 1992). 
This variant of RNA fingerprinting has also been called 'RAP' (RNA Arbitrarily 
Primed)-PCR. One advantage of this second approach is that PCR products may be 
derived from anywhere in the RNA, including open reading frames. In addition, it 
can be used for mRNAs that are not polyadenylated, such as many bacterial mRNAs 
(Wong and McClelland 1994). In both cases, following reverse transcription and 
denaturation, second strand cDNA synthesis is carried out with an arbitrary primer 
(arbitrary primers have a single base at each position, as compared to random 
primers, which contain a mixture of all four bases at each position). The resulting 
PCR, thus, produces a series of products which, depending on the system (primer 
length and composition, polymerase and gel system), usually includes 50-100 
products per primer set (Band and Sager 1989). When a combination of different 
dT-anchors and arbitrary primers is used, almost all mRNA species from a cell can 
be amplified. When the cDNA products from two different populations are analysed 
side by side on a polyacrylamide gel, differences in expression can be identified and 
the appropriate bands recovered for cloning and further analysis. 
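
As a rough guide to how many primer combinations such an experiment implies, the sketch below divides an assumed number of expressed mRNA species by the 50-100 products obtained per primer set. It assumes, unrealistically, that every product is a distinct mRNA and that no mRNA is sampled twice, so the results are lower bounds only. The range of 8000-15 000 mRNA species per mammalian cell is the estimate discussed later in this review; the function name is hypothetical.

```python
import math

def min_primer_sets(n_mrna_species, products_per_set):
    """Lower bound on primer sets needed if every product were a distinct mRNA."""
    return math.ceil(n_mrna_species / products_per_set)

# Most favourable case: few mRNA species, many products per primer set.
fewest = min_primer_sets(8000, 100)
# Least favourable case: many mRNA species, few products per primer set.
most = min_primer_sets(15000, 50)

print(f"between {fewest} and {most} primer sets (lower bounds)")
```

Because products from different primer sets overlap in practice, the number of primer sets actually required would be expected to be higher than these bounds.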

Although DD is perhaps the most popular approach used today for identifying 
differentially expressed genes, it does suffer from several perceived disadvantages: 

(1) It may have a strong bias towards high copy number mRNAs (Bertioli et al. 
1995), although this has been disputed (Wan et al. 1996) and the isolation of very 
low abundance genes may be achieved in certain circumstances (Guimeraes et 
al. 1995a). 

(2) The cDNAs obtained often only represent the extreme 3' end of the mRNA 
(often the 3'-untranslated region), although this may not always be the case 
(Guimeraes et al. 1995a). Since the 3' end is often not included in Genbank and 
shows variation between organisms, cDNAs identified by DD cannot always be 
matched with their genes, even if they have been identified. 

(3) The pattern of differential expression seen on the display often cannot be 
reproduced on Northern blots, with false positives arising in up to 70% of cases 
(Sun et al. 1994). Some adaptations have been shown to reduce false positives, 
including the use of two reverse transcriptases (Sung and Denman 1997), 
comparison of uninduced and induced cells over a time course (Burn et al. 1994) 
and comparison of DDPCR-products from two uninduced and two induced 
lines (Sompayrac et al. 1995). The latter authors also reported that the use of 
cytoplasmic RNA rather than total RNA reduces false positives arising from 
nuclear RNA that is not transported to the cytoplasm. 

Further details of the background, strengths and weaknesses of the DD 
technique can be obtained from a review by McClelland et al. (1996) and from 
articles by Liang et al. (1995) and Wan et al. (1996). 

[Figure 8 diagram: 1st strand cDNA synthesis from mRNA, primed either at the poly(A) tail 
or at an internal site; denature and synthesise 2nd strand with any arbitrary primer; cDNA 
can now be amplified by PCR using original primer pair.] 

Figure 8. Two approaches to differential display (DD) analysis. 1st strand synthesis can be carried out 
either with a poly(dT)nNN primer (where N = G, C or A) or with an arbitrary primer. The use of 
different combinations of G, C and A to anchor the first strand polydT primer enables the priming 
of the majority of polyadenylated mRNAs. Arbitrary primers may hybridize at none, one or more 
places along the length of the mRNA, allowing 1st strand cDNA synthesis to occur at none, one 
or more points in the same gene. In both cases, 2nd strand synthesis is carried out with an arbitrary 
primer. Since these arbitrary primers for the 2nd strand may also hybridize to the 1st strand cDNA 
in a number of different places, several different 2nd strand products may be obtained from one 
binding point of the 1st strand primer. Following 2nd strand synthesis, the original set of primers 
is used to amplify the second strand products, with the result that numerous gene sequences are 
amplified. 

Restriction endonuclease-facilitated analysis of gene expression 

Serial Analysis of Gene Expression (SAGE) 

A more recent development in the field of differential display is SAGE analysis 
(Velculescu et al. 1995). This method uses a different approach to those discussed so 
far and is based on two principles. Firstly, in more than 95% of cases, short 
nucleotide sequences ('tags') of only nine or 10 base pairs provide sufficient 
information to identify their gene of origin. Secondly, concatenation (linking 
together in a series) of these tags allows sequencing of multiple cDNAs within a 
single clone. Figure 9 shows a schematic representation of the SAGE process. In this 
procedure, double stranded cDNA from the test cells is synthesized with a 
biotinylated polydT primer. Following digestion with a commonly cutting (4 bp 
recognition sequence) restriction enzyme ('anchoring enzyme'), the 3' ends of the 
cDNA population are captured with streptavidin beads. The captured population is 
split into two and different adaptors ligated to the 5' ends of each group. Incorporated 
into the adaptors is a recognition sequence for a type IIS restriction enzyme, one 
which cuts DNA at a defined distance (< 20 bp) from its recognition sequence. 
Hence, following digestion of each captured cDNA population with the IIS enzyme, 
the adaptors plus a short piece of the captured cDNA are released. The two 
populations are then ligated and the products amplified. The amplified products are 
cleaved with the original anchoring enzyme, religated (concatemers are formed in 
the process) and cloned. The advantage of this system is that hundreds of gene tags 
can be identified by sequencing only a few clones. Furthermore, the number of times 
a given transcript is identified is a quantitative measurement of that gene's 
abundance in the original population, a feature which facilitates identification of 
differentially expressed genes in different cell populations. 
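
The tag-counting principle can be sketched in a few lines. The example below is illustrative only: it assumes that the anchoring enzyme recognises CATG, takes a 10 bp tag immediately 3' of the 3'-most site in each cDNA, and uses invented toy sequences. It ignores ditag formation, concatenation and sequencing errors, and the function names are hypothetical.

```python
from collections import Counter

ANCHOR_SITE = "CATG"  # assumption: a 4 bp anchoring-enzyme site
TAG_LENGTH = 10       # tags of nine or 10 bp are described in the text

def extract_tag(cdna, site=ANCHOR_SITE, tag_length=TAG_LENGTH):
    """Return the tag next to the 3'-most anchoring site, or None if absent."""
    pos = cdna.rfind(site)  # 3'-most site on the sense strand
    if pos == -1:
        return None
    start = pos + len(site)
    tag = cdna[start:start + tag_length]
    return tag if len(tag) == tag_length else None

def count_tags(cdnas):
    """Count how often each tag occurs in a population of cDNAs."""
    tags = (extract_tag(seq) for seq in cdnas)
    return Counter(tag for tag in tags if tag is not None)

# Invented toy transcripts (sense strand, 5' to 3').
transcript_a = "GGATCC" + "CATG" + "AAACCCGGGT" + "TTAAAAAA"
transcript_b = "ACGT" + "CATG" + "TTGGCCAATT" + "GGAAAAAA"
transcript_c = "CATG" + "AAAAAAAAAA" + "CATG" + "CCGGAATTCC" + "GGAAAA"

# Hypothetical control and treated populations.
control = [transcript_a] * 5 + [transcript_b] * 1 + [transcript_c] * 2
treated = [transcript_a] * 5 + [transcript_b] * 4

control_counts = count_tags(control)
treated_counts = count_tags(treated)

for tag in sorted(set(control_counts) | set(treated_counts)):
    print(tag, control_counts[tag], treated_counts[tag])
```

Comparing the two count tables tag by tag is what makes the method quantitative: in this toy data one tag is unchanged, one rises and one disappears in the treated population.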

Some disadvantages of SAGE analysis are that the method is technically 
difficult, requires a large amount of accurate sequencing, is biased towards abundant 
mRNAs, has not been validated in the pharmaco/toxicogenomic setting and has 
only been used to examine well known tissue differences to date. 

Gene Expression Fingerprinting (GEF) 

A different capture/restriction digest approach for isolating differentially 
expressed genes has been described by Ivanova and Belyavsky (1995). In this 
method, RNA is converted to cDNA using biotinylated oligo(dT) primers. The 
cDNA population is then digested with a specific endonuclease and captured with 
magnetic streptavidin microbeads to facilitate removal of the unwanted 5' digestion 
products. The use of restricted 3'-ends alone serves to reduce the complexity of the 
cDNA fragment pool and helps to ensure that each RNA species is represented by 
not more than one restriction product. An adaptor is ligated to facilitate subsequent 
amplification of the captured population. PCR is carried out with one adaptor- 
specific and one biotinylated polydT primer. The reamplified population is 
recaptured and the non-biotinylated strands removed by alkaline dissociation. The 
non-biotinylated strand is then resynthesized using a different adaptor-specific 
primer in the presence of a radiolabeled dNTP. The labelled immobilized 3' cDNA 
ends are next sequentially treated with a series of different restriction endonucleases 
and the products from each digestion analysed by PAGE. The result is a fingerprint 
composed of a number of ladders (equal to the number of sequential digests used). 
By comparing test versus control fingerprints, it is possible to identify differentially 
expressed products which can then be isolated from the gel and cloned. The 
advantages of this procedure are that it is very robust and reproducible, and the 
authors estimate that 80-93% of cDNA molecules are involved in the final 
fingerprint. The disadvantage is that polyacrylamide gels can rarely resolve more 
than 300-400 bands, which compares poorly to the 1000 or more which are 
estimated to be produced in an average experiment. The use of 2-D gels such as 
those described by Uitterlinden et al. (1989) and Hatada et al. (1991) may help to 
overcome this problem. 

A similar method for displaying restriction endonuclease fragments was later 
described by Prashar and Weissman (1996). However, instead of sequential 
digestion of the immobilized 3'-terminal cDNA fragments, these authors simply 
compared the profiles of the control and treated populations without further 
manipulation. 

Figure 9. Serial analysis of gene expression (SAGE) analysis. cDNA is cleaved with an anchoring enzyme 
(AE) and the 3' ends captured using streptavidin beads. The cDNA pool is divided in half and each 
portion ligated to a different linker, each containing a type IIS restriction site (tagging enzyme, 
TE). Restriction with the type IIS enzyme releases the linker plus a short length of cDNA 
(XXXXX and OOOOO indicate nucleotides of different tags). The two pools of tags are then 
ligated and amplified using linker-specific primers. Following PCR, the products are cleaved with 
the AE and the ditags isolated from the linkers using PAGE. The ditags are then ligated (during 
which process, concatenation occurs) and cloned into a vector of choice for sequencing. After 
Velculescu et al. (1995), with permission. 

DNA arrays 

'Open' differential display systems are cumbersome in that it takes a great deal 
of time to extract and identify candidate genes and then confirm that they are indeed 
up- or down-regulated in the treated compared to the control tissue. Normally, the 
latter process is carried out using Northern blotting or RT-PCR. Even so, each of 
the aforementioned steps produces a bottleneck to the ultimate goal of rapid analysis 
of gene expression. These problems will likely be addressed by the development of 
so-called DNA arrays (e.g. Gress et al. 1992, Zhao et al. 1995, Schena et al. 1996), 
the introduction of which has signalled the next era in differential gene expression 
analysis. DNA arrays consist of a gridded membrane or glass 'chips' containing 
hundreds or thousands of DNA spots, each consisting of multiple copies of part of 
a known gene. The genes are often selected based on previously proven involvement 
in oncogenesis, cell cycling, DNA repair, development and other cellular processes. 
They are usually chosen to be as specific as possible for each gene and animal species. 
Human and mouse arrays are already commercially available and a few companies 
will construct a personalized array to order, for example Clontech Laboratories and 
Research Genetics Inc. The technique is rapid in that hundreds or even thousands 
of genes can be spotted on a single array, and that mRNA/cDNA from the test 
populations can be labelled and used directly as probe. When analysed with 
appropriate hardware and software, arrays offer a rapid and quantitative means to 
assess differences in gene expression between two cell populations. Of course, there 
can only be identification and quantitation of those genes which are in the array 
(hence the term 'closed' system). Therefore, one approach to elucidating the 
molecular mechanisms involved in a particular disease/development system may be 
to combine an open and closed system — a DNA array to directly identify and 
quantitate the expression of known genes in mRNA populations, and an open 
system such as SSH to isolate unknown genes which are differentially expressed. 
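
The quantitative comparison that an array makes possible can be sketched as follows. This is a minimal illustration under stated assumptions, not a description of any vendor's software: each array is scaled to its total signal, expression changes are reported as log2(treated/control), and a 2-fold cut-off is used to call a change. The gene names and intensities are invented, and all signals are assumed to be non-zero.

```python
import math

def log2_ratios(control, treated):
    """log2(treated/control) per gene after scaling each array to its total signal.

    Assumes every gene has a non-zero signal on both arrays.
    """
    total_control = sum(control.values())
    total_treated = sum(treated.values())
    return {
        gene: math.log2((treated[gene] / total_treated) /
                        (control[gene] / total_control))
        for gene in control
    }

def changed_genes(ratios, min_fold=2.0):
    """Genes whose expression differs by at least min_fold in either direction."""
    cutoff = math.log2(min_fold)
    return sorted(gene for gene, r in ratios.items() if abs(r) >= cutoff)

# Invented spot intensities for three genes on two arrays.
control = {"gene_1": 100.0, "gene_2": 400.0, "gene_3": 500.0}
treated = {"gene_1": 400.0, "gene_2": 100.0, "gene_3": 500.0}

ratios = log2_ratios(control, treated)
print(changed_genes(ratios))
```

Only genes spotted on the array appear in the output, which is the 'closed system' limitation noted above.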

One of the main advantages of DNA arrays is the huge number of gene fragments 
which can be put on a membrane — some companies have reported gridding up to 
60000 spots on a single glass 'chip' (microscope slide). These high density chip- 
based micro-arrays will probably become available as mass-produced off-the-shelf 
items in the near future. This should facilitate the more rapid determination of 
differential expression in time and dose-response experiments. Aside from their 
high cost and the technical complexities involved in producing and probing DNA 
arrays, the main problem which remains, especially with the newer micro-array 
(gene-chip) technologies, is that results are often not wholly reproducible between 
arrays. However, this problem is being addressed and should be resolved within the 
next few years. 



EST databases as a means to identify differentially expressed genes 

Expressed sequence tags (ESTs) are partial sequences of clones obtained from 
cDNA libraries. Even though most ESTs have no formal identity (putative 
identification is the best to be hoped for), they have proven to be a rapid and efficient 
means of discovering new genes and can be used to generate profiles of gene- 
expression in specific cells. Since they were first described by Adams et al. (1991), 
there has been a huge explosion in EST production and it is estimated that there are 
now well over a million such sequences in the public domain, representing over half 
of all human genes (Hillier et al. 1996). This large number of freely available 
sequences (both sequence information and clones are normally available royalty-free 
from the originators) has enabled the development of a new approach towards 
differential gene expression analysis as described by Vasmatzis et al. (1998). The 
approach is simple in theory: EST databases are first searched for genes that have a 
number of related EST sequences from the target tissue of choice, but none or few 
from non-target tissue libraries. Programmes to assist in the assembly of such sets of 
overlapping data may be developed in-house or obtained privately or from the 
internet. For example, the Institute for Genomic Research (TIGR, found at 
http://www.tigr.org) provides many software tools free of charge to the scientific 
community. Included amongst these is the TIGR assembler (Sutton et al. 1995), a 
tool for the assembly of large sets of overlapping data such as ESTs, bacterial 
artificial chromosomes (BACs), or small genomes. Candidate EST clones representing 
different genes are then analysed using RNA blot methods for size and tissue 
specificity and, if required, used as probes to isolate and identify the full length 
cDNA clone for further characterization. In practice however, the method is rather 
more involved, requiring bioinformatic and computer analysis coupled with 
confirmatory molecular studies. Vasmatzis et al. (1998) have described several 
problems in this fledgling approach, such as separating highly homologous 
sequences derived from different genes and an overemphasis of specificity for some 
EST sequences. However, since these problems will largely be addressed by the 
development of more suitable computer algorithms and an increased completeness 
of the EST database, it is likely that this approach to identifying differentially 
expressed genes may enjoy more patronage in the future. 
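
The database search step can be sketched as a simple filter over EST counts. The example assumes that ESTs have already been assembled into clusters and tallied by tissue library; the cluster names, tissue names, counts and thresholds are all invented for illustration, and the function is hypothetical rather than part of any published tool.

```python
def tissue_enriched(est_counts, target, min_target=3, max_other=0):
    """Clusters with many ESTs from the target tissue and few from elsewhere.

    est_counts maps cluster name -> {tissue library: number of ESTs}.
    """
    selected = []
    for cluster, by_tissue in est_counts.items():
        in_target = by_tissue.get(target, 0)
        elsewhere = sum(n for tissue, n in by_tissue.items() if tissue != target)
        if in_target >= min_target and elsewhere <= max_other:
            selected.append(cluster)
    return sorted(selected)

# Invented EST tallies per cluster and tissue library.
est_counts = {
    "cluster_A": {"prostate": 6, "liver": 0, "brain": 0},
    "cluster_B": {"prostate": 4, "liver": 5, "brain": 2},
    "cluster_C": {"prostate": 1},
    "cluster_D": {"prostate": 3, "brain": 1},
}

strict = tissue_enriched(est_counts, "prostate")
relaxed = tissue_enriched(est_counts, "prostate", max_other=1)
print(strict, relaxed)
```

Relaxing `max_other` corresponds to accepting genes with 'none or few' ESTs from non-target libraries; candidates selected this way would still need confirmation by RNA blotting, as described above.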



Problems and potential of differential expression techniques 

The holistic or single cell approach? 

When working with in vivo models of differential expression, one of the first 
issues to consider must be the presence of multiple cell types in any given specimen. 
For example, a liver sample is likely to contain not only hepatocytes, but also 
(potentially) Ito cells, bile ductule cells, endothelial cells, various immune cells (e.g. 
lymphocytes, macrophages and Kupffer cells) and fibroblasts. Other tissues will 
each have their own distinctive cell populations. Also, in the case of neoplastic tissue, 
there are almost always normal, hyperplastic and/or dysplastic cells present in a 
sample. One must, therefore, be aware that genes obtained from a differential 
display experiment performed on an animal tissue model may not necessarily arise 
exclusively from the intended 'target' cells, e.g. hepatocytes/neoplastic cells. If 
appropriate, further analyses using immunohistochemistry, in situ hybridization or 
in situ RT-PCR should be used to confirm which cell types are expressing the 
gene(s) of interest. This problem is probably most acute for those studying the 
differential expression of genes in the development of different cell types, where 
there is a need to examine homologous cell populations. The problem is now being 
addressed at the National Cancer Institute (Bethesda, MD, USA) where new micro-dissection 
techniques have been employed to assist in their gene analysis programme, 
the Cancer Genome Anatomy Project (CGAP) (for more information see web site: 
http://www.ncbi.nlm.nih.gov/ncicgap/intro.html). There are also separation techniques 
available that utilise cell-specific antigens as a means to isolate target cells, 
e.g. fluorescence activated cell sorting (FACS) (Dunbar et al. 1998, Kas-Deelen et 
al. 1998) and magnetic bead technology (Richard et al. 1998, Rogler et al. 1998). 

However, those taking a holistic approach may consider this issue unimportant. 
There is an equally appropriate view that all those genes showing altered expression 
within a compromised tissue should be taken into consideration. After all, since all 
tissues are complex mixes of different, interacting cell types which intimately 
regulate each other's growth and development, it is clear that each cell type could in 
some way contribute (positively or negatively) towards the molecular mechanisms 
which lie behind responses to external stimuli or neoplastic growth. It is perhaps 
then more informative to carry out differential display experiments using in vivo as 
opposed to in vitro models, where uniform populations of identical cells probably 
represent a partial, skewed or even inaccurate picture of the molecular changes that 
occur. 

The incidence and possible implications of inter-individual biological variation 
should be considered in any approach where whole animal models are being used. It 
is clear that individuals (humans and animals) respond in different ways to identical 
stimuli. One of the best characterized examples is the debrisoquine oxidation 
polymorphism, which is mediated by cytochrome CYP2D6 and determines the 
pharmacokinetics of many commonly prescribed drugs (Lennard 1993, Meyer and 
Zanger 1997). The reasons for such differences are varied and complex, but allelic 
variations, regulatory region polymorphisms and even physical and mental health 
can all contribute to observed differences in individual responses. Careful thought 
should, therefore, be given to the specific objectives of the study and to the possible 
value of pooling starting material (tissue/mRNA). The effect of this can be 
beneficial through the ironing out of exaggerated responses and unimportant minor 
fluctuations of (mechanistically) irrelevant genes in individual animals, thus 
providing a clearer overall picture of the general molecular mechanisms of the 
response. However, at the same time such minor variations may be of utmost 
importance in deciding the ability of individual animals to succumb to or resist the 
effects of a given chemical/disease. 



How efficient are differential expression techniques at recovering a high percentage of 
differentially expressed genes? 

A number of groups have produced experimental data suggesting that mam- 
malian cells produce between 8000-15 000 different mRNA species at any one time 
(Mechler and Rabbitts 1981, Hedrick et al. 1984, Bravo 1990), although figures as 
high as 20 000-30 000 have also been quoted (Axel et al. 1976). Hedrick et al. (1984)
provided evidence suggesting that the majority of these belong to the rare abundance 
class. A breakdown of this abundance distribution is shown in table 1. 

When the results of differential display experiments have been compared with 
data obtained previously using other methods, it is apparent that not all differentially 
expressed mRNAs are represented in the final display. In particular, rare messages 
(which, importantly, often include regulatory proteins) are not easily recovered 
using differential display systems. This is a major shortcoming, as the majority of 
mRNA species exist at levels of less than 0.005% of the total population (table 1). 
Bertioli et al. (1995) examined the efficiency of DD templates (heterogeneous 
mRNA populations) for recovering rare messages and were unable to detect mRNA
species present at less than 1.2% of the total mRNA population, a level equivalent to an
intermediate or abundant species. Interestingly, when simple model systems (single
target only) were used instead of a heterogeneous mRNA population, the same
primers could detect target mRNA at levels some 10 000-fold lower. These results
are probably best explained by competition for substrates from the many PCR 
products produced in a DD reaction. 
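
The gap between these detection limits and the abundance of rare messages can be made explicit with a little arithmetic. The following Python sketch uses only the percentages quoted above; the function and variable names are illustrative and are not taken from Bertioli et al. (1995).

```python
# Figures quoted in the text; all names here are illustrative.
DD_LIMIT_COMPLEX_PCT = 1.2       # detection limit with a heterogeneous mRNA template
SINGLE_TARGET_GAIN = 10_000      # single-target systems detected ~10 000-fold lower levels
RARE_CLASS_CEILING_PCT = 0.005   # most mRNA species lie below this level (table 1)


def single_target_limit_pct(complex_limit_pct: float, gain: float) -> float:
    """Detection limit (% of total mRNA) implied for a single-target system."""
    return complex_limit_pct / gain


def fold_shortfall(limit_pct: float, abundance_pct: float) -> float:
    """How many times more abundant a message must be to reach the limit."""
    return limit_pct / abundance_pct


single_limit = single_target_limit_pct(DD_LIMIT_COMPLEX_PCT, SINGLE_TARGET_GAIN)
shortfall = fold_shortfall(DD_LIMIT_COMPLEX_PCT, RARE_CLASS_CEILING_PCT)

print(f"single-target limit: {single_limit:.5f}% of total mRNA")
print(f"a 0.005% message is ~{shortfall:.0f}-fold below the 1.2% limit")
```

On these figures, a message at 0.005% of the population would need to be roughly 240-fold more abundant to be seen in the heterogeneous template, whereas the implied single-target limit (about 0.00012%) lies well below the rare-class ceiling.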

The numbers of differentially expressed mRNAs reported in the literature using
various model systems provide further evidence that many differentially expressed
mRNAs are not recovered. For example, DeRisi et al. (1997) used DNA array 
technology to examine gene expression in yeast following exhaustion of sugar in the 
medium, and found that more than 1700 genes showed a change in expression of at 
least 2-fold. In light of such a finding, it would not be unreasonable to suggest that 
of the 8000-15 000 different mRNA species produced by any given mammalian cell, 
up to 1000 or more may show altered expression following chemical stimulation. 
Whilst this may be an extreme figure, it is known that at least 100 genes are 
activated/upregulated in Jurkat (T-) cells following IL-2 stimulation (Ullman et al. 
1990). In addition, Wan et al. (1996) estimated that interferon-γ-stimulated HeLa
cells differentially express up to 433 genes (assuming 24000 distinct mRNAs 
expressed by the cells). However, there have been few publications documenting 
anywhere near the recovery of these numbers. For example, in using DD to compare 
normal and regenerating mouse liver, Bauer et al. (1993) found only 70 of 38000 
total bands to be different. Of these, 50% (35 genes) were shown to correspond to 
differentially expressed bands. Chen et al. (1996) reported 10 genes upregulated in 
female rat liver following ethinyl estradiol treatment. McKenzie and Drake (1997) 
identified 14 different gene products whose expression was altered by phorbol 
myristate acetate (PMA, a tumour promoter agent) stimulation of a human 
myelomonocytic cell line. Kilty and Vickers (1997) identified 10 different gene 
products whose expression was upregulated in the peripheral blood leukocytes of 
allergic disease sufferers. Linskens et al. (1995) found 23 genes differentially 
expressed between young and senescent fibroblasts. Techniques other than DD 
have also provided an apparent paucity of differentially expressed genes. Using SH 
for example, Cao et al. (1997) found 15 genes differentially expressed in colorectal 
cancer compared to normal mucosal epithelium. Fitzpatrick et al. (1995) isolated 17 
genes upregulated in rat liver following treatment with the peroxisome proliferator, 
clofibrate; Philips et al. (1990) isolated 12 cDNA clones which were upregulated in 
highly metastatic mammary adenocarcinoma cell lines compared to poorly meta- 
static ones. Prashar and Weissman (1996) used 3' restriction fragment analysis and 
identified approximately 40 genes showing altered expression within 4 h of 
activation of Jurkat T-cells. Groenink and Leegwater (1996) analysed 27 gene 
fragments isolated using SSH of delayed early response phase of liver regeneration 
and found only 12 to be upregulated. 
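
To put these recoveries in perspective, one can scale the proportion estimated by Wan et al. (1996) to the 8000-15 000 mRNA species assumed above. The sketch below performs only that arithmetic. Carrying a proportion from interferon-stimulated HeLa cells over to other cell types is an assumption made purely for illustration, and the names are hypothetical.

```python
# Estimate quoted in the text: up to 433 of an assumed 24 000 distinct mRNAs.
WAN_DIFFERENTIAL = 433
WAN_TOTAL = 24_000
MRNA_SPECIES_RANGE = (8_000, 15_000)  # species per mammalian cell, as quoted above


def expected_differential(total_species: int, fraction: float) -> float:
    """Scale a fraction of differentially expressed mRNAs to a population size."""
    return total_species * fraction


fraction = WAN_DIFFERENTIAL / WAN_TOTAL
expected_low, expected_high = (
    expected_differential(n, fraction) for n in MRNA_SPECIES_RANGE
)

# Recoveries reported in the studies cited above.
reported = {
    "Bauer et al. (1993)": 35,
    "Chen et al. (1996)": 10,
    "McKenzie and Drake (1997)": 14,
    "Linskens et al. (1995)": 23,
}

print(f"expected range: {expected_low:.0f}-{expected_high:.0f} genes")
for study, genes in reported.items():
    print(f"{study}: {genes} genes ({genes / expected_low:.0%} of the lower estimate)")
```

Even against the lower end of this illustrative range (about 144 genes), each of the listed studies recovered only a minority of the number that might be expected.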

In the laboratory, SSH was used to isolate up to 70 candidate genes which appear 
to show altered expression in guinea pig liver following short-term treatment with 
the peroxisome proliferator, WY-14,643 (Rockett, Swales, Esdaile and Gibson, 
unpublished observations). However, these findings have still to be confirmed by 
analysis of the extracted tissue mRNA for differential expression of these sequences. 

Whilst the latest differential display technologies are purported to include design 
and experimental modifications to overcome this lack of efficiency (in both the total 
number of differentially expressed genes recovered and the percentage that are true
positives), it is still not clear if such adaptations are practically effective: proving
efficiency by spiking with a known amount of limited numbers of artificial 
construct(s) is one thing, but isolating a high percentage of the rare messages already 
present in an mRNA population is another. Of course, some models will genuinely 
produce only a small number of differentially expressed genes. In addition, there are 
also technical problems that can reduce efficiency. For example, mRNAs may have 
an unusual primary structure that effectively prevents their amplification by PCR- 
based systems. In addition, it is known that under certain circumstances not all 
mRNAs have 3' poly A sites. For example, during Xenopus development, deadenylation
is used as a means to stabilize RNAs (Voeltz and Steitz 1998), whilst
preferential deadenylation may play a role in regulating Hsp70 (and perhaps, 
therefore, other stress protein) expression in Drosophila (Dellavalle et al. 1994). The 
presence of deadenylated mRNAs would clearly reduce the efficiency of systems 
utilizing a polydT reverse transcription step. The efficiency of any system also 
depends on the quality of the starting material. All differential display techniques 
use mRNA as their target material. However, it is difficult to isolate mRNA that is 
completely free of ribosomal RNA. Even if polydT primers are used to prime first 
strand cDNA synthesis, ribosomal RNA is often transcribed to some degree 
(Clontech PCR-Select cDNA Subtraction kit user manual). It has been shown, at 
least in the case of SSH, that a high rRNA:mRNA ratio can lead to inefficient 
subtractive hybridization (Clontech PCR-Select cDNA Subtraction kit user 
manual), and there is no reason to suppose that it will not do likewise in other SH 
approaches. Finally, those techniques that utilise a presubtraction amplification step 
(e.g. RDA) may present a skewed representation since some sequences amplify 
better than others. 

Of course, probably the most important consideration is the temporal factor. It 
is clear that any given differential display experiment can only interrogate a cell at 
one point in time. It may well be that a high percentage of the genes showing altered 
expression at that time are obtained. However, given that disease processes and 
responses to environmental stimuli involve dynamic cascades of signalling, 
regulation, production and action, it is clear that all those genes which are switched 
on/off at different times will not be recovered and, therefore, vital information may
well be missed. It is, therefore, imperative to obtain as much information about the 
model system beforehand as possible, from which a strategy can be derived for 
targeting specific time points or events that are of particular interest to the 
investigator. One way of getting round this problem of single time point analysis is 
to conduct the experiment over a suitable time course which, of course, adds 
substantially to the amount of work involved. 



How sensitive are differential expression technologies? 

There has been little published data that addresses the issue of how large the 
change in expression must be for it to permit isolation of the gene in question with 
the various differential expression technologies. Although the isolation of genes 
whose expression is changed as little as 1.5-fold has been reported using SSH 
(Groenink and Leegwater 1996), it appears that those demonstrating a change in 
excess of 5-fold are more likely to be picked up. Thus, there is a 'grey zone' 
in between where small changes could fade in and out of isolation between
experiments and animals. DD, on the other hand, is not subject to this grey
zone since, unlike SH approaches, it does not amplify the difference in expression 
between two samples. Wan et al. (1996) reported that differences in expression of 
twofold or more are detectable using DD. 
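
The thresholds quoted for SSH can be summarised as a simple classification. This is a minimal sketch: the cut-offs (1.5-fold and 5-fold) come from the text, whereas the function name, the category labels and the symmetric handling of down-regulation are assumptions added for illustration.

```python
def ssh_recovery_outlook(fold_change: float) -> str:
    """Classify a fold change against the SSH thresholds quoted in the text.

    A fold change below 1 (down-regulation) is converted to its reciprocal so
    that, for example, 0.2 is treated as a 5-fold change. This symmetry is an
    assumption of the sketch.
    """
    if fold_change <= 0:
        raise ValueError("fold change must be positive")
    magnitude = max(fold_change, 1 / fold_change)
    if magnitude > 5:
        return "likely recovered"
    if magnitude >= 1.5:
        return "grey zone"
    return "below reported limit"


for value in (1.2, 2.0, 5.0, 8.0, 0.1):
    print(value, ssh_recovery_outlook(value))
```

A gene falling in the grey zone may be isolated in one experiment and missed in the next, which is worth bearing in mind when comparing gene lists between animals or replicate subtractions.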

Resolution and visualization of differential expression products 

It seems highly improbable with current technology that a gel system could be 
developed that is able to resolve all gene species showing altered expression in any 
given test system (be it SH- or DD-based). Polyacrylamide gel electrophoresis 
(PAGE) can resolve size differences down to 0.2% (Sambrook et al. 1989) and is
used as standard in DD experiments. Even so, it is clear that a complex series of gene 
products such as those seen in a DD will contain unresolvable components. Thus, 
what appears to be one band in a gel may in fact turn out to be several. Indeed, it has 
been well documented (Mathieu-Daude et al. 1996, Smith et al. 1997) that a single 
band extracted from a DD often represents a composite of heterogeneous products, 
and the same has been found for SSH displays in this laboratory (Rockett et al. 
1997). One possible solution was offered by Mathieu-Daude et al. (1996), who 
extracted and reamplified candidate bands from a DD display and used single strand 
conformation polymorphism (SSCP) analysis to confirm which components 
represented the truly differentially expressed product. 

Many scientists often try to avoid the use of PAGE where possible because it is 
technically more demanding than agarose gel electrophoresis (AGE). Unfortunately, 
high resolution agarose gels such as Metaphor (FMC, Lichfield, UK) and AquaPor 
HR (National Diagnostics, Hessle, UK), whilst easier to prepare and manipulate 
than PAGE, can only separate DNA sequences which differ in size by around 
1.5-2% (15-20 base pairs for a 1Kb fragment). Thus, SSH, RDA or other such 
products which differ in size by less than this amount are normally not resolvable. 
However, a simple technique does in fact exist for increasing the resolving power of 
AGE — the inclusion of HA-red (10-phenyl neutral red-PEG ligand) or HA-yellow 
(bisbenzamide-PEG ligand) (Hanse Analytik GmbH, Bremen, Germany) in a 
gel separates identical or closely sized products on base content. Specifically, 
HA-red and -yellow selectively bind to GC and AT DNA motifs, respectively 
(Wawer et al. 1995, Hanse Analytik 1997, personal communication). Since both 
HA-stains possess an overall positive charge, they migrate towards the cathode 
when an electric field is applied. This is in direct opposition to DNA, which 
is negatively charged and, therefore, migrates towards the anode. Thus, if two 
DNA clones are identical in size (as perceived on a standard high resolution 
agarose gel), but differ in AT/GC content, inclusion of a HA-dye in the gel 
will effectively retard the migration of one of the sequences compared to the 
other, effectively making it apparently larger and, thus, providing a means of 
differentiating between the two. The use of HA-red has been shown to resolve 
sequences with an AT variation of less than 1% (Wawer et al. 1995), whilst Hanse 
Analytik have reported that HA staining is so sensitive that in one case it was used 
to distinguish two 567bp sequences which differed by only a single point mutation 
(Hanse Analytik 1996, personal communication). Therefore, if one wishes to check 
whether all the clones produced from a specific band in a differential display 
experiment are derived from the same gene species, a small amount of reamplified 
or digested clone can be run on a standard high resolution gel, and a second aliquot
in a similar gel containing one of the HA-stains. The standard gel should indicate
any gross size differences, whilst the HA-stained gel should separate otherwise
unresolvable species (on standard AGE) according to their base content. Geisinger
et al. (1997) reported successful use of this approach for identifying DD-derived
clones. Figure 10 shows such an experiment carried out in this laboratory on clones
obtained from a band extracted from an SSH display.

Figure 10. Discrimination of clones of identical/nearly identical size using HA-red. Bands of decreasing
size (1-5) were extracted from the final display of a suppression subtractive hybridization
experiment and cloned. Seven colonies were picked at random from each cloned band and their
inserts amplified using PCR. The products were run on two gels: (A) a high resolution 2% agarose
gel, and (B) a high resolution 2% agarose gel containing 1 U/ml HA-red. With few exceptions, all
the clones from each band appear to be the same size (gel A). However, the presence of HA-red
(gel B), which separates identically-sized DNA fragments based on the percentage of GC within
the sequence, clearly indicates the presence of different gene species within each band. For
example, even though all five re-amplified clones of band 1 appear to be the same size, at least four
different gene species are represented.
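
The size-resolution limits quoted above for PAGE (0.2%) and high resolution agarose (1.5-2%) translate directly into a minimum resolvable difference in base pairs. The sketch below assumes that the limit is a fixed fraction of the larger fragment, which is a simplification; the function names are illustrative.

```python
PAGE_FRACTION = 0.002               # 0.2% (Sambrook et al. 1989)
AGAROSE_FRACTIONS = (0.015, 0.02)   # 1.5-2% for high resolution agarose


def min_resolvable_difference_bp(fragment_bp: float, fraction: float) -> float:
    """Smallest size difference (bp) a gel with this fractional limit can resolve."""
    if fragment_bp <= 0:
        raise ValueError("fragment size must be positive")
    return fragment_bp * fraction


def resolvable(size_a_bp: float, size_b_bp: float, fraction: float) -> bool:
    """True if two fragments differ by at least the limit for the larger one."""
    larger = max(size_a_bp, size_b_bp)
    return abs(size_a_bp - size_b_bp) >= min_resolvable_difference_bp(larger, fraction)


for label, fraction in (("PAGE", PAGE_FRACTION), ("agarose (best case)", AGAROSE_FRACTIONS[0])):
    limit = min_resolvable_difference_bp(1000, fraction)
    print(f"{label}: 1 kb fragments must differ by about {limit:.0f} bp")

print(resolvable(1000, 990, AGAROSE_FRACTIONS[1]))  # 10 bp apart on agarose
print(resolvable(1000, 990, PAGE_FRACTION))         # 10 bp apart on PAGE
```

Two 1 kb products differing by 10 bp would therefore be expected to co-migrate on agarose but separate on PAGE, which is the situation in which a second, HA-stained gel becomes useful.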

An alternative approach is to carry out a 2-D analysis of the differential display 
products. In this approach, size-based separation is first carried out in a standard 
agarose gel. The gel slice containing the display is then extracted and incorporated 
into an HA gel for resolution based on AT/GC content.

Of course, one should always consider the possibility of there being different 
gene species which are the same size and have the same GC/AT content. However, 
even these species are not unresolvable given some effort: again, one might use
SSCP, or perhaps a denaturing gradient gel electrophoresis (DGGE) or temperature
gradient gel electrophoresis (TGGE) approach to resolve the contents of a band,
either directly on the extracted band (Suzuki et al. 1991) or on the reamplified
product.

The requirement of some differential display techniques to visualize large 
numbers of products (e.g. DD and GEF) can also present a problem in that, in terms 
of numbers, the resolution of PAGE rarely exceeds 300-400 bands. One approach to 
overcoming this might be to use 2-D gels such as those described by Uitterlinden et
al. (1989) and Hatada et al. (1991).

Extraction of differentially expressed bands from a gel can be complex since, in
some cases (e.g. DD, GEF), the results are visualized by autoradiographic means, 
such that precise overlay of the developed film on the gel must occur if the correct 
band is to be extracted for further analysis. Clearly, a misjudged extraction can 
account for many man-hours lost. This problem, and that of the use of radioisotopes,
have been addressed by several groups. For example, Lohmann et al. (1995)
demonstrated that silver staining can be used directly to visualize DD bands in 
horizontal PAGs. An et al. (1996) avoided the use of radioisotopes by transferring a 
small amount (20-30%) of the DNA from their DD to a nylon membrane, and 
visualizing the bands using chemiluminescent staining before going back to extract 
the remaining DNA from the gel. Chen and Peck (1996) went one step further and 
transferred the entire DD to a nylon membrane. The DNA bands were then 
visualized using a digoxigenin (DIG) system (DIG was attached to the polydT 
primers used in the differential display procedure). Differentially expressed bands 
were cut from the membrane and the DNA eluted by washing with PCR buffer prior 
to reamplification.

One of the advantages of using techniques such as SSH and RDA is that the final 
display can be run on an agarose gel and the bands visualized with simple ethidium 
bromide staining. Whilst this approach can provide acceptable results, overstaining 
with SYBR Green I or SYBR Gold nucleic acid stains (FMC) effectively enhances 
the intensity and sharpness of the bands. This greatly aids in their precise extraction 
and often reveals some faint products that may otherwise be overlooked. Whilst 
differential displays stained with SYBR Green I are better visualized using short 
wavelength UV (254 nm) rather than medium wavelength (306 nm), the shorter 
wavelength is much more DNA damaging. In practice, it takes only a few seconds 
to damage DNA extracted under 254 nm irradiation, effectively preventing 
reamplification and cloning. The best approach is to overstain with SYBR Green I
and extract bands under a medium wavelength UV transillumination. 

The possible use of 'microfingerprinting' to reduce complexity 

Given the sheer number of gene products and the possible complexity of each 
band, an alternative approach to rapid characterization may be to use an enhanced 
analysis of a small section of a differential display: a 'sub-fingerprint' or
'micro-fingerprint'. In this case, one could concentrate on those bands which only appear
in a particular chosen size region. Reducing the fingerprint in this way has at least 
two advantages. One is that it should be possible to use different gel types, 
concentrations and run times tailored exactly to that region. Currently, one might 
run products from 100-3000+ bp on the same gel, which leads to compromise in the
gel system being used and consequently to suboptimal resolution, both in terms of 
size and numbers, and can lead to problems in the accurate excision of individual 
bands. Secondly, it may be possible to enhance resolution by using a 2-D analysis 
using a HA-stain, as described earlier. In summary, if a range of gene product sizes 
is carefully chosen to include certain 'relevant' genes, the 2-D system standardized,
and appropriate gene analysis used, it may be possible to develop a method for the 
early and rapid identification of compounds which have similar or widely different 
cellular effects. If the prognosis for exposure to one or more other chemicals which 
display a similar profile is already known, then one could perhaps predict similar 
effects for any new compounds which show a similar micro-fingerprint.

An alternative approach to microfingerprinting is to examine altered expression
in specific families of genes through careful selection of PCR primers and/or
post-reaction analysis. Stress genes, growth factors and/or their receptors, cell cycling
genes, cytochromes P450 and regulatory proteins might be considered as candidates 
for analysis in this way. Indeed, some off-the-shelf DNA arrays (e.g. Clontech's 
Atlas cDNA Expression Array series) already anticipate this to some degree by
grouping together genes involved in different responses e.g. apoptosis, stress, DNA- 
damage response etc. 



Screening 

False positives 

The generation of false positives has been discussed at length amongst the 
differential display community (Liang et al. 1993, 1995, Nishio et al. 1994, Sun et al.
1994, Sompayrac et al. 1995). The reason for false positives varies with the 
technique being used. For instance, in RDA, the use of adaptors which have not 
been HPLC purified can lead to the production of false positives through illegitimate 
ligation events (O'Neill and Sinclair 1997), whilst in DD they can arise through 
PCR artifacts and illegitimate transcription of rRNA. In SH, false positives appear 
to be derived largely from abundant gene species, although some may arise from 
cDNA/mRNA species which do not undergo hybridization for technical reasons. 

A quick screening of putative differentially expressed clones can be carried out 
using a simple dot blot approach, in which labelled first strand probes synthesized 
from tester and driver mRNA are hybridized to an array of said clones (Hedrick et 
al. 1984, Sakaguchi et al. 1986). Differentially expressed clones will hybridize to 
tester probe, but not driver. The disadvantage of this approach is that rare species 
may not generate detectable hybridization signals. One option for those using SSH 
is to screen the clones using a labelled probe generated from the subtracted cDNA 
from which it was derived, and with a probe made from the reverse subtraction 
reaction (ClonTechniques 1997a). Since the SSH method enriches rare sequences, 
it should be possible to confirm the presence of clones representing low abundance 
genes. Despite this quick screening step, there is still the need to go back to the 
original mRNA and confirm the altered expression using a more quantitative 
approach. Although this may be achieved using Northern blots, the sensitivity is 
poor by today's high standards and one must rely on PCR methods for accurate and 
sensitive determinations (see below). 



Sequence analysis 

The majority of differential display procedures produce final products which are 
between 100 and 1000 bp in size. However, this may considerably reduce the size of
the sequence for analysis of the DNA databases. This in turn leads to a reduced 
confidence in the result — several families of genes have members whose DNA 
sequences are almost identical except in a few key stretches, e.g. the cytochrome 
P450 gene superfamily (Nelson et al. 1996). Thus, does the clone identified as being 
almost identical to gene X0 really come from that gene, or its brother gene Xx, or its
as yet undiscovered sister Xa? For example, using SSH, part of a gene was isolated,
which was up-regulated in the liver of rats exposed to Wy-14,643 and was identified
by a FASTA search as being transferrin (data not shown). However, transferrin is 
known to be downregulated by hypolipidemic peroxisome proliferators such as Wy- 
14,643 (Hertz et al. 1996), and this was confirmed with subsequent RT-PCR 
analysis. This suggests that the gene sequence isolated may belong to a gene which 
is closely related to transferrin, but is regulated by a different mechanism. 

A further problem associated with SH technology is redundancy. In most cases 
before SH is carried out, the cDNA population must first be simplified by restriction 
digestion. This is important for at least two reasons: 

(1) To reduce complexity — long cDNA fragments may form complex networks 
which prevent the formation of appropriate hybrids, especially at the high 
concentrations required for efficient hybridization. 

(2) Cutting the cDNAs into small fragments provides better representation of 
individual genes. This is because genes derived from related but distinct 
members of gene families often have similar coding sequences that may cross- 
hybridize and be eliminated during the subtraction procedure (Ko 1990). 
Furthermore, different fragments from the same cDNA may differ considerably 
in terms of hybridization and amplification and, thus, may not efficiently do one 
or the other (Wang and Brown 1991). Thus, some fragments from differentially 
expressed cDNAs may be eliminated during subtractive hybridization pro- 
cedures. However, other fragments may be enriched and isolated. As a 
consequence of this, some genes will be cut one or more times, giving rise to two 
or more fragments of different sizes. If those same genes are differentially 
expressed, then two or more of the different size fragments may come through 
as separate bands on the final differential display, increasing the observed 
redundancy and increasing the number of redundant sequencing reactions. 

Sequence comparisons also throw up another important point: at what degree
of sequence similarity does one accept a result? Is 90% identity between a gene
derived from your model species and another acceptably close? Is 95% between 
your sequence and one from the same species also acceptable? This problem is 
particularly relevant when the forward and reverse sequence comparisons give 
similar sequences with completely different gene species! An arbitrary decision 
seems to be to allocate genes that are definite (95% and above similarity) and then 
group those between 60 and 95% as being related or possible homologues. 
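
The arbitrary rule described above is easily written down so that it is at least applied consistently across a set of clones. In this sketch the 95% and 60% cut-offs come from the text; the function name, the label for matches below 60% and the example values are assumptions added for illustration.

```python
def classify_match(percent_identity: float) -> str:
    """Apply the 95% / 60% identity rule of thumb to a database match."""
    if not 0 <= percent_identity <= 100:
        raise ValueError("percent identity must lie between 0 and 100")
    if percent_identity >= 95:
        return "definite"
    if percent_identity >= 60:
        return "related or possible homologue"
    # The text does not say how weaker matches are treated; this label is assumed.
    return "unassigned"


# Hypothetical clone names and identities, for illustration only.
matches = {"clone A": 98.2, "clone B": 90.0, "clone C": 41.5}
for clone, identity in matches.items():
    print(f"{clone}: {identity}% -> {classify_match(identity)}")
```

Where forward and reverse reads of the same clone fall into different categories, or match different genes, the rule cannot settle the matter and further sequence is needed.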

Quantitative analysis 

At some point, one must give consideration to the quantitative analysis of the 
candidate genes, either as a means of confirming that they are truly differentially 
expressed, or in order to establish just what the differences are. Northern blot 
analysis is a popular approach as it is relatively easy and quick to perform. However, 
the major drawback with Northern blots is that they are often not sensitive enough 
to detect rare sequences. Since the majority of messages expressed in a cell are of low 
abundance (see table 1), this is a major problem. Consequently, RT-PCR may be the
method of choice for confirming differential expression. Although the procedure is 
somewhat more complex than Northern analysis, requiring synthesis of primers and 
optimization of reaction conditions for each gene species, it is now possible to set up 
high throughput PCR systems using multichannel pipettes, plates of 96 or more wells and
appropriate thermal cycling technology. Whilst quantitative analysis is more
desirable, being more accurate and without reliance on an internal standard, the 
money and time needed to develop a competitor molecule is often excessive, 
especially when one might be examining tens or even hundreds of gene species. The 
use of semi-quantitative analysis is simpler, although still relatively involved. One 
must first of all choose an internal standard that does not change in the test cells 
compared to the controls. Numerous reference genes have been tried in the past, for 
example interferon-gamma (IFN-γ, Frye et al. 1989), β-actin (Heuval et al. 1994),
glyceraldehyde-3-phosphate dehydrogenase (GAPDH, Wong et al. 1994), dihydrofolate
reductase (DHFR, Mohler and Butler 1991), β-2-microglobulin (β-2-m,
Murphy et al. 1990), hypoxanthine phosphoribosyl transferase (HPRT, Foss et
al. 1998) and a number of others (ClonTechniques 1997b). Ideally, an internal 
standard should not change its level of expression in the cell regardless of cell age, 
stage in the cell cycle or through the effects of external stimuli. However, it has been 
shown on numerous occasions that the levels of most housekeeping genes currently 
used by the research community do in fact change under certain conditions and in 
different tissues (ClonTechniques 1997b). It is imperative, therefore, that preliminary
experiments be carried out on a panel of housekeeping genes to establish
their suitability for use in the model system. 
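
One way of carrying out such a preliminary screen is sketched below: the candidate whose mean signal changes least between control and treated samples is selected, and target signals are then expressed relative to it. The intensity values are hypothetical, the function names are illustrative, and the use of a log2 ratio as the stability measure is an assumption rather than a method described in the sources cited.

```python
import math
from statistics import mean


def treated_to_control_ratio(control, treated):
    """Ratio of mean treated signal to mean control signal for one gene."""
    return mean(treated) / mean(control)


def most_stable_reference(panel):
    """Return the candidate whose treated/control ratio is closest to 1 (log2 scale)."""
    return min(
        panel,
        key=lambda gene: abs(math.log2(treated_to_control_ratio(*panel[gene]))),
    )


def normalised_fold_change(target_control, target_treated, ref_control, ref_treated):
    """Fold change of a target gene after dividing by the reference signal."""
    return (target_treated / ref_treated) / (target_control / ref_control)


# Hypothetical band intensities (arbitrary units): (control replicates, treated replicates).
panel = {
    "beta-actin": ([100, 104, 96], [150, 160, 155]),
    "GAPDH": ([80, 82, 78], [81, 79, 83]),
    "HPRT": ([20, 22, 18], [12, 13, 11]),
}

best = most_stable_reference(panel)
ref_control, ref_treated = (mean(values) for values in panel[best])
fold = normalised_fold_change(10, 30, ref_control, ref_treated)
print(f"reference: {best}; normalised fold change of target: {fold:.2f}")
```

With these invented figures GAPDH would be chosen; had beta-actin been used instead, an apparent 3-fold induction of the target would have been reported as roughly 2-fold, which illustrates why the panel needs to be tested in each model system.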

Interpretation of quantitative data must also be treated with caution. By 
comparing the lists of genes identified by differential expression one can perhaps 
gain insight into why two different species react in different ways to external stimuli. 
For example, rats and mice appear sensitive to the non-genotoxic effects of a wide 
range of peroxisome proliferators whilst Syrian hamsters and guinea pigs are largely 
resistant (Orton et al. 1984, Rodricks and Turnbull 1987, Lake et al. 1989, 1993,
Makowska et al. 1992). A simplified approach to resolving the reason(s) why is to 
compare lists of up- and down-regulated genes in order to identify those which are 
expressed in only one species and, through background knowledge of the effects of 
the said gene, might suggest a mechanism of facilitated non-genotoxic carcinogenesis 
or protection. Of course, the situation is likely to be far more complex. Perhaps if 
there were one key gene protecting guinea pig from non-genotoxic effects and it was 
upregulated 50 times by PPs, the same gene might only be up-regulated five times 
in the rat. However, since both were noted to be upregulated, the importance of the 
gene may be overlooked. Just to complicate matters, a large change in expression 
does not necessarily mean a biologically important change. For example, what is the 
true relevance of gene Y which shows a 50-fold increase after a particular treatment, 
and gene Z which shows only a 5-fold increase? If one examines the literature one
may find that historically, gene Y has often been shown to be up-regulated
40-60-fold by a number of unrelated stimuli; in light of this the 50-fold increase would
appear less significant. However, the literature may show that gene Z has never been
recorded as having more than doubled in expression, which makes your 5-fold
increase all the more exciting. Perhaps even more interesting is if that same 5-fold 
increase has only been seen in related neoplasms or following treatment with related 
chemicals. 
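
The comparison of genes Y and Z can be expressed as a ratio of the observed change to the largest change previously recorded for that gene. This is only a sketch of the reasoning above: the function name is illustrative, and a single ratio is a much cruder measure than a proper survey of the literature.

```python
def relative_to_history(observed_fold: float, historical_max_fold: float) -> float:
    """Observed fold change divided by the largest fold change previously reported."""
    if observed_fold <= 0 or historical_max_fold <= 0:
        raise ValueError("fold changes must be positive")
    return observed_fold / historical_max_fold


# Gene Y: 50-fold observed, 40-60-fold commonly reported for unrelated stimuli.
gene_y = relative_to_history(50, 60)
# Gene Z: 5-fold observed, never previously seen to more than double.
gene_z = relative_to_history(5, 2)

print(f"gene Y: {gene_y:.2f} x historical maximum")
print(f"gene Z: {gene_z:.2f} x historical maximum")
```

On this measure the 5-fold change in gene Z (2.5 times anything previously recorded) stands out more than the 50-fold change in gene Y, which lies within its usual range.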

Problems in using the differential display approach 

Differential display technology originally held promise of an easily obtainable 
'fingerprint' of those genes which are up- or down-regulated in test animals/cells in
a developmental process or following exposure to given stimuli. However, it has
become clear that the fingerprinting process, whilst still valid, is much too complex
to be represented by a single technique profile. This is because all differential display 
techniques have common and/or unique technical problems which preclude the 
isolation and identification of all those genes which show changes in expression. 
Furthermore, there are important genetic changes related to disease development 
which differential expression analysis is simply not designed to address. An example 
of this is the presence of small deletions, insertions, or point mutations such as those 
seen in activated oncogenes, tumour suppressor genes and individual polymorphisms.
Polymorphic variations, small though they usually are, are often
regarded as being of paramount importance in explaining why some patients 
respond better than others to certain drug treatments (and, in logical extension, why 
some people are less affected by potentially dangerous xenobiotics/carcinogens than
others). The identification of such point mutations and naturally occurring 
polymorphisms requires the subsequent application of sequencing, SSCP, DGGE 
or TGGE to the gene of interest. Furthermore, differential display is not designed 
to address issues such as alternatively spliced gene species or whether an increased 
abundance of mRNA is a result of increased transcription or increased mRNA 
stability. 



Conclusions 

Perhaps the main advantage of open system differential display techniques is that 
they are not limited by extant theories or researcher bias in revealing genes which are 
differentially expressed, since they are designed to amplify all genes which 
demonstrate altered expression. This means that they are useful for the isolation of 
previously unknown genes which may turn out to be useful biomarkers of a particular
state or condition. At least one open system (SAGE) is also quantitative, thus 
eliminating the need to return to the original mRNA and carry out Northern/PCR
analysis to confirm the result. However, the rapid progress of genome mapping 
projects means that over the next 5-10 years or so, the balance of experimental use 
will switch from open to closed differential display systems, particularly DNA 
arrays. Arrays are easier and faster to prepare and use, provide quantitative data, are 
suitable for high throughput analysis and can be tailored to look at specific signalling 
pathways or families of genes. Identification of all the gene sequences in human and 
common laboratory animals combined with improved DNA array technology, 
means that it will soon no longer be necessary to try to isolate differentially expressed 
genes using the technically more demanding open system approach. Thus, their 
main advantage (that of identifying unknown genes) will be largely eradicated. It is 
likely, therefore, that their sphere of application will be reduced to analysis of the 
less common laboratory species, since it will be some time yet before the genomes of 
such animals as zebrafish, electric eels, gerbils, crayfish and squid, for example, will 
be sequenced. 

Of course, in the end the question will always remain: What is the functional/ 
biological significance of the identified, differentially expressed genes? One 
persistent problem is understanding whether differentially expressed genes are a 
cause or consequence of the altered state. Furthermore, many chemicals, such as 
non-genotoxic carcinogens, are also mitogens and so genes associated with 
replication will also be upregulated but may have little or nothing to do with the 
carcinogenic effect. Whilst differential display technology cannot hope to answer 
these questions, it does provide a springboard from which identification, regulatory 
and functional studies can be launched. Understanding the molecular mechanism of 
cellular responses is almost impossible without knowing the regulation and function 
of those genes and their condition (e.g. mutated). In an abstract sense, differential 
display can be likened to a still photograph, showing details of a fixed moment in 
time. Consider the Historian who knows the outcome of a battle and the placement 
and condition of the troops before the battle commenced, but is asked to try and 
deduce how the battle progressed and why it ended as it did from a few still 
photographs— an impossible task. In order to understand the battle, the Historian 
must find out the capabilities and motivation of the soldiers and their commanding 
officers, what the orders were and whether they were obeyed. He must examine the 
terrain, the remains of the battle and consider the effects the prevailing weather 
conditions exerted. Likewise, if mechanistic answers are to be forthcoming, the 
scientist must use differential display in combination with other techniques, such as 
knockout technology, the analysis of cell signalling pathways, mutation analysis and 
time and dose response analyses. Although this review has emphasized the 
importance of differential gene profiling, it should not be considered in isolation and 
the full impact of this approach will be strengthened if used in combination with 
functional genomics and proteomics (2-dimensional protein gels from isoelectric 
focusing and subsequent SDS electrophoresis and virtual 2D-maps using capillary 
electrophoresis). Proteomics is attracting much recent attention as many of the 
changes resulting in differential gene expression do not involve changes in mRNA 
levels, as described extensively herein, but rather protein-protein, protein-DNA and 
protein phosphorylation events which would require functional genomics or 
proteomic technologies for investigation. 

Despite the limitations of differential display technology, it is clear that many 
potential applications and benefits can be obtained from characterizing the genetic 
changes that occur in a cell during normal and disease development and in response 
to chemical or biological insult. In light of functional data, such profiling will 
provide a 'fingerprint' of each stage of development or response, and in the long 
term should help in the elucidation of specific and sensitive biomarkers for different 
types of chemical/biological exposure and disease states. The potential medical and 
therapeutic benefits of understanding such molecular changes are almost im- 
measurable. Amongst other things, such fingerprints could indicate the family or 
even specific type of chemical an individual has been exposed to plus the length 
and/or acuteness of that exposure, thus indicating the most prudent treatment. 
They may also help uncover differences in histologically identical cancers, provide 
diagnostic tests for the earliest stages of neoplasia and, again, perhaps indicate the 
most efficacious treatment. 

The Human Genome Project will be completed early in the next century and the 
DNA sequence of all the human genes will be known. The continuing development 
and evolution of differential gene expression technology will ensure that this 
knowledge contributes fully to the understanding of human disease processes. 

SUMMARY 



The technique of differential display reverse transcription-polymerase chain reaction (ddRT-PCR) has been used to produce unique 
profiles of up-regulated and down-regulated gene expression in the liver of male Wistar rats following short term exposure to the 
non-genotoxic hepatocarcinogens, phenobarbital and Wy-14,643. Animals were treated for 3 days, whereupon their livers were 
extracted and snap frozen. mRNA was prepared from the livers and used for ddRT-PCR. Individual bands from the differential 
displays were extracted and cloned. False positives were eliminated by dotblot screening and true positives then sequenced and 
identified. 



INTRODUCTION 

Safety evaluation of new chemicals usually necessi- 
tates the examination of genotoxic and carcinogenic 
potential using short-term in vitro and in vivo geno- 
toxicity assays augmented by chronic bioassay tests. 
The short-term assays have proved useful in the early 
identification of potential genotoxic carcinogens, but 
their value is limited by observations which suggest 
that approximately 60% of chemicals identified as car- 
cinogens in life-exposure studies produce mainly 
negative findings in short-term genotoxicity tests (1,2). 
Thus, there is currently no reliable and rapid means of 
evaluating the carcinogenic risk of new chemicals 
which fall into this latter group of compounds, termed 
non-genotoxic (or epigenetic) carcinogens. 

It is now evident that non-genotoxic carcinogens 
constitute a group of chemicals which are not only di- 
vergent in their interspecies toxicity, but also demon- 
strate different target organ selectivities and mecha- 
nisms of action (3,4). Elucidation of the molecular 
mechanisms underlying non-genotoxic carcinogenesis 
is currently underway, but the picture is still far from 
complete. It is anticipated that a better understanding 
of the early changes in genetic expression following 
exposure to non-genotoxic carcinogens will aid devel- 
opment of experimental strategies to identify cellular 
markers which are diagnostic for this type of toxicity. 

Subtractive ddRT-PCR is a recently developed 
technique which facilitates the preferential amplifica- 
tion of gene products that demonstrate altered expres- 
sion in target tissue(s) following exposure to chemical 
stimuli. Furthermore, using this technique, no prior 
knowledge of the specific genes which are up/down 
regulated is required. In the current study, we have un- 
dertaken to develop a specific and rapid assay for non- 
genotoxic carcinogens using the technique of ddRT- 
PCR. This has allowed us to identify characteristic 
patterns of gene regulation following administration of 
two different non-genotoxic carcinogens (phenobarbi- 
tal and Wy-14,643) and the subsequent identification 
of individual gene species which are regulated by this 
xenobiotic treatment. 



MATERIALS AND METHODS 
Animals and treatment 

Phenobarbital (BDH, Poole, UK; 100 mg/kg/day) or 
[4-chloro-6-(2,3-xylidino)-2-pyrimidinylthio] acetic 
acid (Wy-14,643) (Campo, Emmerich; 250 
mg/kg/day) was administered by gavage to groups of 
3 male Wistar rats (150-200 g) on three consecutive 
days, whilst control animals received nothing. All ani- 
mals had free access to food (rat and mouse standard 
diet, B&K Universal, Hull, UK) and water. The ani- 
mals were killed on the fourth day, whereupon their 
livers were excised, sliced into 0.5 cm cubes, snap fro- 
zen in liquid nitrogen and then stored at -70°C. 
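
As a worked illustration of the dosing regime above, the amount given to each animal per day is simply body weight multiplied by the stated mg/kg rate. The sketch below is illustrative only: the function names are hypothetical, and the weights used are just the two ends of the stated 150-200 g range, not measured values.

```python
# Illustrative arithmetic only; names are hypothetical and not from the paper.

DOSE_MG_PER_KG_PER_DAY = {
    "phenobarbital": 100.0,  # mg/kg/day, as stated in the text
    "Wy-14,643": 250.0,      # mg/kg/day, as stated in the text
}


def daily_dose_mg(body_weight_g: float, dose_mg_per_kg: float) -> float:
    """Return the daily dose in mg for one animal of the given weight."""
    if body_weight_g <= 0 or dose_mg_per_kg < 0:
        raise ValueError("weight must be positive and dose non-negative")
    return body_weight_g * dose_mg_per_kg / 1000.0


def dose_range_mg(compound: str, weights_g=(150.0, 200.0)) -> tuple:
    """Daily dose (mg) at the lower and upper ends of the weight range."""
    rate = DOSE_MG_PER_KG_PER_DAY[compound]
    return tuple(daily_dose_mg(w, rate) for w in weights_g)


if __name__ == "__main__":
    for name in DOSE_MG_PER_KG_PER_DAY:
        low, high = dose_range_mg(name)
        print(f"{name}: {low:.1f}-{high:.1f} mg per animal per day")
```

On these assumptions a 150-200 g rat would receive 15-20 mg of phenobarbital or 37.5-50 mg of Wy-14,643 per day.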



mRNA extraction 

Up to 0.25 g of each frozen liver sample was ground 
under liquid nitrogen using a mortar and pestle. 
mRNA was extracted from the ground liver using 
Promega's PolyATtract® System 1000 (Promega, 
Madison, WI, USA) according to the technical man- 
ual. The mRNA was DNase-treated (Promega, final 
concentration 10 U/ml) before phenol/chloroform ex- 
traction and ethanol precipitation. The mRNA was re- 
suspended at a final concentration of 500-1000 ng/µl. 



ddRT-PCR 

This was carried out using the PCR-Select™ cDNA 
Subtraction Kit (Clontech, Palo Alto, CA, USA) ac- 
cording to the manufacturer's instructions. Final PCR 
reactions were run on a 2% Metaphor agarose (FMC, 
Rockland, MD, USA) gel containing ethidium bro- 
mide (Sigma, Dorset, UK) and then overstained for 30 
min with SYBR Green I DNA stain (FMC, 1:10 000 
dilution in TAE). 



Band extraction and cloning 

Each discernible band from the differential display 
pattern was extracted from the gel with a scalpel and 
the DNA eluted using a Genelute™ Agarose Spin Col- 
umn (Supelco, Bellefonte). An aliquot of the eluted 
DNA (5 µl) was re-amplified using the original ddRT- 
PCR nested primers and electrophoresed on a 2% 
agarose gel. The re-amplified band was extracted from 
the gel (as above) and the eluted DNA ligated directly 
into the TOPO TA Cloning® vector (Invitrogen, 
Carlsbad) before transformation in Escherichia coli 
TOP10F' One Shot™ cells (Invitrogen). 



Stage I screening 

Twelve transformed (white) colonies from each band 
were grown up for 6 h in 200 µl LB broth containing 
ampicillin (Sigma, 50 µg/ml) and 1 µl of this was ampli- 
fied by PCR (as specified in the ddRT-PCR tech- 
nical manual). One quarter of the completed reaction 
was electrophoresed on a standard 2% agarose gel and 
one quarter on a 2% agarose gel containing HA Yel- 
low (Hanse Analytik GmbH, Bremen, Germany, 1 
U/µl) to discern the different cloning products. The re- 
mainder was used to prepare duplicate dotblots on Hy- 
bond N+ (nylon) membranes (Amersham, Little Chal- 
font, UK). Cultures containing different cloning prod- 
ucts were grown up and a plasmid miniprep prepared 
from each (Wizard Plus SV Minipreps DNA Purifica- 
tion System, Promega) according to the manufac- 
turer's instructions. 



Stage II screening 

The duplicate dotblots were probed with: (a) the final 
differential display reaction; and (b) the 'reverse-sub- 
tracted' differential display reaction. To make the 're- 
verse-subtracted' probe, the subtractive hybridisation 
step of the ddRT-PCR procedure was carried out using 
the original tester cDNA as a driver and the driver as 
a tester. Probing and visualisation were carried out us- 
ing the ECL Direct Nucleic Acid Labelling and Detec- 
tion System (Amersham) according to the manufac- 
turer's instructions. Those clones which were positive 
for (a) but negative for (b), or showed a substantially 
larger positive signal with (a) compared to (b), were 
chosen for further analysis. 
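
The selection rule described above can be sketched as a simple filter. This is a hedged illustration, not the authors' procedure: the text does not say how "positive" or "substantially larger" were judged (presumably by eye), so the `background` and `min_ratio` thresholds, the `DotblotResult` type and the numeric signal values are all assumptions introduced here.

```python
from dataclasses import dataclass


@dataclass
class DotblotResult:
    clone_id: str
    forward_signal: float  # signal with probe (a), the final subtracted reaction
    reverse_signal: float  # signal with probe (b), the reverse-subtracted reaction


def select_clones(results, background=0.1, min_ratio=3.0):
    """Keep clones positive for (a) and either negative for (b) or much
    stronger with (a) than with (b). Both thresholds are assumed values."""
    selected = []
    for r in results:
        positive_a = r.forward_signal > background
        positive_b = r.reverse_signal > background
        if not positive_a:
            continue  # no signal with the forward probe: not a candidate
        if not positive_b or r.forward_signal >= min_ratio * r.reverse_signal:
            selected.append(r.clone_id)
    return selected


if __name__ == "__main__":
    demo = [
        DotblotResult("clone-1", 1.0, 0.0),   # (a) only: kept
        DotblotResult("clone-2", 1.0, 0.9),   # similar with both probes: dropped
        DotblotResult("clone-3", 0.05, 0.0),  # below background: dropped
        DotblotResult("clone-4", 3.0, 0.5),   # much stronger with (a): kept
    ]
    print(select_clones(demo))
```

A clone that hybridises equally to both probes is treated as a likely false positive, which matches the intent of the two-probe screen.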



DNA sequencing 

Positive clones as identified above were sequenced on 
an automated ABI DNA sequencer (Applied Biosys- 
tems, Warrington, UK). 

Fig. 1 : (A) Subtractive ddRT-PCR patterns obtained from rat liver following 3-day treatment with Wy-14,643 or phenobarbital. Lane 
1, 1 kb ladder; lane 2, genes up-regulated following Wy-14,643 treatment; lane 3, genes down-regulated following 
Wy-14,643 treatment; lane 4, genes up-regulated following phenobarbital treatment; lane 5, genes down-regulated following 
phenobarbital treatment; and lane 6, 1 kb ladder. (B) Subtractive ddRT-PCR patterns obtained from rat liver showing relative 
changes when phenobarbital-treated mRNA is subtracted from Wy-14,643-treated mRNA and vice versa. Lane 1, 1 kb 
ladder; lane 2, genes showing increased expression following Wy-14,643 treatment compared to phenobarbital treatment; 
lane 3, genes showing increased expression following phenobarbital treatment compared to Wy-14,643 treatment. See 
Materials and Methods for further details. 




Fig. 2 : Re-amplified ddRT-PCR products which were down-regulated following phenobarbital treatment (up-regulated bands were also 
re-amplified but gel not shown). Individual DNA bands excised from the gel of ddRT-PCR reactions were extracted, 
re-amplified and run on agarose gels to confirm amplification of the correct band (numbered). See Materials and Methods for 
further details. 

Table I : Rat liver genes down-regulated by phenobarbital treatment 

Phenobarbital down-regulated 

Band number (Fig. 2)       Clone      Highest sequence   FASTA-EMBL gene identification 
(approximate size in bp)              homology 

1 (1500)                              95.3%              Rat mRNA for 3-oxoacyl-CoA thiolase 
2 (1200)                              92.3%              Rat hemopexin mRNA 
3 (1000)                              91.7%              R. rattus alpha-2u-globulin mRNA 
7 (700)                    Clone 1    77.2%              M. musculus mRNA for C1 inhibitor 
                           Clone 2    94.5%              Rat electron transfer flavoprotein 
                           Clone 3    91.0%              Mouse topoisomerase 1 (Topo 1) mRNA 
8 (650)                    Clone 1    86.9%              Soares 2NbMT M. musculus (EST) 
                           Clone 2    96.2%              Rat alpha-2u-globulin (s-type) mRNA 
9 (600)                    Clone 1    86.9%              Soares mouse NML M. musculus (EST) 
                           Clone 2    82.0%              Soares p3NMF19.5 M. musculus (EST) 
10 (550)                              73.8%              Soares mouse NML M. musculus (EST) 
11 (525)                              95.7%              NCI_CGAP_Pr1 H. sapiens (EST) 
12 (375)                              100.0%             R. norvegicus mRNA for ribosomal protein 
13 (230)                   Clone 1    97.2%              Soares mouse embryo NbME135 (EST) 
                           Clone 2    100.0%             Rat fibrinogen B-beta-chain 
                           Clone 3    100.0%             Rat apolipoprotein E gene 
14 (170)                              96.0%              Soares p3NMF19.5 M. musculus (EST) 
15 (140)                              97.3%              Stratagene mouse testis (EST) 
Others: (300)                         96.7%              R. norvegicus RASP 1 mRNA 
        (275)                         93.1%              Soares mouse mammary gland (EST) 

EST = expressed sequence tag. 
Bands 4-6 were shown to be false positives by dotblot analysis and, therefore, not sequenced. 



Table II : Rat liver genes up-regulated by phenobarbital treatment 

Phenobarbital up-regulated 

Band number                Clone      Highest sequence   FASTA-EMBL gene identification 
(approximate size in bp)              homology 

5 (1300)                              93.5%              Rat cytochrome P450IIB1 
7 (1000)                              95.1%              mRNA for rat preproalbumin 
                                                         Rat serum albumin mRNA 
8 (950)                               98.3%              NCI_CGAP_Pr1 H. sapiens (EST) 
10 (850)                              95.7%              Rat cytochrome P450IIB1 
11 (800)                   Clone 1    94.9%              Rat cytochrome P450IIB1 
                           Clone 2    75.3%              Rat cytochrome p450-L (p450IIB2) 
12 (750)                              93.8%              Rat TRPM-2 mRNA 
                                                         Rat mRNA for sulfated glycoprotein 
15 (600)                              92.9%              mRNA for rat preproalbumin 
                                                         Rat serum albumin mRNA 
16 (550)                   Clone 1    95.2%              Rat cytochrome P450IIB1 
                           Clone 2    93.6%              Rat haptoglobulin mRNA partial alpha 
21 (350)                              99.3%              R. norvegicus genes for 18S, 5.8S & 28S rRNA 

EST = expressed sequence tag. 
Bands 1-4, 6, 9, 13, 14 and 17-20 shown to be false positives by dotblot analysis and, therefore, not sequenced. 
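
The footnote and the band numbers listed in Table II can be cross-checked with a little arithmetic. The sketch below is an illustration added here, not part of the original analysis; the helper names are hypothetical, and the band lists are transcribed from the table and footnote as read.

```python
def expand_bands(spec: str) -> set:
    """Expand a band list such as '1-4, 6, 9' into a set of band numbers."""
    bands = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            low, high = part.split("-")
            bands.update(range(int(low), int(high) + 1))
        else:
            bands.add(int(part))
    return bands


def false_positive_summary(false_spec: str, sequenced_spec: str):
    """Return (number of false positives, total bands, false-positive fraction)."""
    false_pos = expand_bands(false_spec)
    sequenced = expand_bands(sequenced_spec)
    if not false_pos.isdisjoint(sequenced):
        raise ValueError("a band cannot be both a false positive and sequenced")
    total = len(false_pos) + len(sequenced)
    return len(false_pos), total, len(false_pos) / total


if __name__ == "__main__":
    # Band numbers as read from Table II and its footnote.
    n_false, n_total, fraction = false_positive_summary(
        "1-4, 6, 9, 13, 14, 17-20", "5, 7, 8, 10, 11, 12, 15, 16, 21"
    )
    print(f"{n_false} of {n_total} bands were false positives ({fraction:.0%})")
```

Read this way, the two lists account for every band from 1 to 21 with no overlap, and 12 of the 21 excised up-regulated bands were rejected at the dotblot stage.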

Identification of differentially-regulated genes 

Gene sequences were identified using the FASTA pro- 
gramme (http://www.ebi.ac.uk/htbin/fasta.py?request) 
to search all EMBL databases for matching DNA se- 
quences. 

RESULTS 

Figure 1A,B shows the ddRT-PCR patterns of genes 
showing altered expression in rat liver following 3 day 
treatment with phenobarbital or Wy-14,643. Individual 
bands were isolated from the phenobarbital-modulated 
patterns (both up- and down-regulated), re-amplified 
(Fig. 2), cloned, screened for false positives and then 
identified. Those xenobiotic-modulated gene products 
identified to date are listed in Tables I and II. 

DISCUSSION 

The advent of combinatorial chemistry has led to the 
synthesis of millions of new chemical compounds, 
many of which may be potentially useful in pharma- 
ceutical, agricultural or industrial applications. How- 
ever, whilst there are tests available for those posing a 
genotoxic activity, there remains no short-term assay 
able to identify those chemicals which may belong to 
the non-genotoxic group of carcinogens. 

We have used an adaptation of the subtractive hy- 
bridisation method - ddRT-PCR - to produce charac- 
teristic profiles or 'fingerprints' of those genes which 
are up-regulated or down-regulated in male rat liver 
following acute exposure to test chemicals. The ddRT- 
PCR profiles are characteristic and unique for each of 
the 2 compounds studied to date. 

A number of those gene species showing altered 
expression following phenobarbital treatment have 
been cloned and identified (Tables I & II). It is inter- 
esting to note the presence of CYP2B2 in the up-regu- 
lated genes. This would, of course, be expected fol- 
lowing exposure to phenobarbital and serves as a posi- 
tive control for the method. Other genes which one 
might normally expect to be up-regulated do not ap- 
pear in the table. However, it should be noted that not 
all bands seen on the differential display were ex- 
tracted and re-amplified due to their being too faint or 
too close to other bands to accurately excise. Further- 
more, it has been well documented [(5) and references 
therein] that a single band extracted from a differential 
display often represents a composite of heterogeneous 
products. We are currently examining new methods to: 
(i) improve resolution of the differential display pat- 
terns (including 2-D agarose gels); and (ii) distinguish 
those ddRT-PCR products which are identical in size, 
but different in sequence. 

Our future efforts will be directed towards deter- 
mining the extent of modulation of a number of the 
genes reported herein using semi-quantitative RT- 
PCR. This should reveal the extent of changes in ex- 
pression of key gene products which may be involved 
in non-genotoxic hepatocarcinogenesis and thus help 
increase understanding of this process. Furthermore, it 
is anticipated that aligning ddRT-PCR profiles of dif- 
ferent non-genotoxic agents found in responsive and 
non-responsive species may enable identification of 
those genes which are mechanistically relevant to the 
non-genotoxic hepatocarcinogenic process. Accord- 
ingly, this approach lends itself well to the identifica- 
tion, characterisation and sub-classification of possible 
different classes of non-genotoxic carcinogens. 

ACKNOWLEDGEMENT 

This work was funded by Rhone-Poulenc Agrochemicals, 
France 



REFERENCES 

1. Parodi S. (1992) : Non-genotoxic factors in the carcinogenic 
process: problems of detection and hazard evaluation. Toxicol. 
Lett., 64/65, 621-630. 

2. Ashby J. (1992) : Prediction of non-genotoxic carcinogenesis. 
Toxicol. Lett., 64/65, 605-612. 

3. Grasso G. and Sharratt M. (1991) : Role of persistent, 
non-genotoxic tissue damage in rodent cancer and relevance to 
humans. Annu. Rev. Pharmacol. Toxicol., 31, 253-287. 

4. Lake B. (1995) : Mechanisms of hepatocarcinogenicity of 
peroxisome-proliferating drugs and chemicals. Annu. Rev. 
Pharmacol. Toxicol., 35, 483-507. 

5. Smith N.R., Li A., Aldersley M., High A.S., Markham A.E., 
Robinson P.A. (1997) : Rapid determination of the complexity 
of cDNA bands extracted from DDRT-PCR polyacrylamide 
gels. Nucleic Acids Research, 25 (17), 3552-3554. 



Exhibit E of Rockett Declaration 

with Response dated 03/18/04 

In USSN: 10/031,904 



www.elsevier.com/locate/toxicol 

Use of suppression-PCR subtractive hybridisation to identify 
genes that demonstrate altered expression in male rat and 
guinea pig livers following exposure to Wy-14,643, a 
peroxisome proliferator and non-genotoxic hepatocarcinogen 

John C. Rockett 1, Karen E. Swales, David J. Esdaile 2, G. Gordon Gibson * 

Molecular Toxicology Group, School of Biological Sciences, University of Surrey, Guildford, Surrey GU2 5XH, UK 




ELSEVIER Toxicology 144 (2000) 13-29 



Abstract 

Understanding the genetic profile of a cell at all stages of normal and carcinogenic development should provide an 
essential aid to developing new strategies for the prevention, early detection, diagnosis and treatment of cancers. We 
have attempted to identify some of the genes that may be involved in peroxisome-proliferator (PP)-induced 
non-genotoxic hepatocarcinogenesis using suppression PCR subtractive hybridisation (SSH). Wistar rats (male) were 
chosen as a representative susceptible species and Duncan-Hartley guinea pigs (male) as a resistant species to the 
hepatocarcinogenic effects of the PP, [4-chloro-6-(2,3-xylidino)-2-pyrimidinylthio] acetic acid (Wy-14,643). In each 
case, groups of four test animals were administered a single dose of Wy-14,643 (250 mg/kg per day in corn oil) by 
gastric intubation for 3 consecutive days. The control animals received corn oil only. On the fourth day the animals 
were killed and liver mRNA extracted. SSH was carried out using mRNA extracted from the rat and guinea pig 
livers, and used to isolate genes that were up and downregulated following Wy-14,643 treatment. These genes 
included some predictable (and hence positive control) species such as CYP4A1 and CYP2C11 (upregulated and 
downregulated in rat liver, respectively). Several genes that may be implicated in hepatocarcinogenesis have also been 
identified, as have some unidentified species. This work thus provides a starting point for developing a molecular 
profile of the early effects of a non-genotoxic carcinogen in sensitive and resistant species that could ultimately lead 
to a short-term assay for this type of toxicity. © 2000 Elsevier Science Ireland Ltd. All rights reserved. 

RT-PCR; Rat; Guinea pig; Gene regulation; Differential gene display; Gene profiling 

1. Introduction 

The advent of combinatorial chemistry and 
computer-aided drug design has led to a recent 
surge in the number of chemical compounds 
that have potential therapeutic, agricultural and 
industrial applications. Although it has been sug- 
gested that the contribution of synthetic chemicals 
to the overall incidence of human cancer is low, 
there still remains an absolute requirement to 
evaluate all new chemicals for toxic and carcino- 
genic potential. The latter is one of the most 
problematic areas of chemical safety evaluation 
and is usually carried out using short-term in vitro 
and in vivo genotoxicity assays augmented by 
chronic bioassay tests. The short-term assays have 
proved useful in the early identification of poten- 
tial genotoxic carcinogens, but their value is lim- 
ited by observations that suggest that 
approximately 60% of chemicals identified as car- 
cinogens in life-exposure studies produce mainly 
negative findings in short-term genotoxicity tests 
(Ashby, 1992; Parodi, 1992). Thus, there is cur- 
rently no reliable and rapid means of evaluating 
the carcinogenic risk of new chemicals that fall 
into this latter group of compounds, termed non- 
genotoxic (or epigenetic) carcinogens. 

One approach to addressing this problem is to 
elucidate the molecular mechanisms by which 
known non-genotoxic carcinogens act. It should 
then be possible to identify common factors/ 
mechanisms that can serve as early biomarkers of 
carcinogenic potential for new chemicals. To this 
end, a large number of groups have reported on 
the various effects of non-genotoxic compounds 
in various animal species (Marsman et al., 1988; 
[...]e et al., 1993; Cattley et al., 1994; Hayashi et 
al., 1994; Human and Experimental Toxicology, 
1994; Anderson et al., 1996). However, the mech- 
anistic picture is still far from complete with many 
of those genes involved in the carcinogenic pro- 
cess remaining unknown, and their identification 
therefore remains a key goal in elucidating the 
molecular mechanisms by which non-genotoxic 
carcinogenesis occurs. 

Subtractive hybridisation (SH) and related tech- 
nologies such as representational difference analy- 
sis (RDA) (Hubank and Schatz, 1994) and 
differential display (DD) (Liang and Pardee, 
1992) can be used to aid the isolation of genes 
showing altered expression in target tissues fol- 
lowing exposure to a chemical stimulus. These 
techniques can also be used to identify differential 
gene expression in neoplastic and normal cells 
(Liang et al., 1992), infected and normal cells 
(Duguid and Dinauer, 1990), differentiated and 
undifferentiated cells (Sargent and Dawid, 1983; 
Guimaraes et al., 1995), activated and dormant 
cells (Gurskaya et al., 1996; Wan et al., 1996), 
different cell types (Hedrick et al., 1984; Davis et 
al., 1984) amongst others. Most importantly, us- 
ing such approaches, no prior knowledge of the 
specific genes that are upregulated/downregulated 
is required. 

Using a variation of SH, termed suppression- 
PCR subtractive hybridisation (SSH) (Diatchenko 
et al., 1996), we have previously reported the 
isolation of a number of genes showing altered 
expression in male rat liver following acute expo- 
sure to phenobarbital (Rockett et al., 1997). In 
the current work we have used the same experi- 
mental approach to isolate genes that are differen- 
tially expressed in the livers of male rats and 
guinea pigs following short-term (3-day) exposure 
to the peroxisome proliferator (PP) and non- 
genotoxic hepatocarcinogen, Wy-14,643. We have 
isolated and identified a number of gene species, 
some of which may be important in the induction 
of, or protection against, non-genotoxic 
hepatocarcinogenesis. 



2. Materials and methods 

2.1. Animals and treatment 

All animal experiments were undertaken in ac- 
cordance with Her Majesty's Home Office De- 
partment guidelines under the auspices of 
approved personal and project licences. Male 
Wistar rats (150-200 g) and male Duncan-Hart- 
ley guinea pigs (250-300 g) were obtained from 
Kingman and Bantam (Hull, UK). Upon receipt, 
both groups were randomly assigned into two 
groups of four. They were maintained on a rat, 
mouse or guinea pig standard diet (B&K Univer- 
sal, Hull) and a daily cycle of alternating 12-h 
periods of dark and light. The room temperature 
was maintained at 19°C and a relative humidity of 
55%. The animals were acclimatised to this envi- 
ronment for 7 days before treatment commenced. 
[4-chloro-6-(2,3-xylidino)-2-pyrimidinylthio] acetic 
acid (Wy-14,643, Campo, Emmerich; 250 mg/kg 
per day in corn oil) was administered by gavage 
to the treated groups of rats and guinea pigs on 3 
consecutive days, whilst control groups received 
an equal volume of corn oil only. During this 
time, all animals had free access to food and 
water. The animals were killed by cervical disloca- 
tion on the fourth day, and their livers immedi- 
ately excised, weighed, sliced into approximately 
0.5-cm cubes, snap frozen in liquid nitrogen and 
stored at -70°C. 

2.2. mRNA extraction 

Approximately 0.25 g of each frozen liver sam- 
ple was ground under liquid nitrogen using a 
mortar and pestle. Messenger RNA was extracted 
from the ground liver using the PolyATtract® 
System 1000 kit (Promega, Madison, USA) ac- 
cording to the technical manual provided by the 
manufacturers. The mRNA was DNase-treated 
(RQ1 RNase-free DNase, Promega, final concentra- 
tion 10 U/ml) before phenol/chloroform extrac- 
tion and ethanol precipitation. The mRNA was 
redissolved at a final concentration of 500-1000 ng/µl. 

2.3. cDNA Subtraction 

This was carried out using the PCR-Select™ 
cDNA Subtraction Kit (Clontech, Palo Alto, 
USA) according to the manufacturer's instruc- 
tions. Subtractions were carried out with mRNAs 
derived from single animals. The mRNA from the 
remaining three animals in each group was later 
used for quantitative RT-PCR analysis of specific 
genes. 

2.4. Band extraction and cloning 

The secondary PCR reactions from the cDNA 
subtraction procedure were run on a 2% 
Metaphor agarose gel (FMC, Rockland, USA) 
containing 0.5 µg/ml ethidium bromide (Sigma, 
Dorset, UK). One times TAE (0.04 M Tris-ac- 
etate, 0.001 M EDTA) was used to prepare the gel 
and as the running buffer. After running for 6-7 
h at 3.75 V/cm, the gel was overstained for 30 min 
with SYBR Green I DNA stain (FMC, 1:10000 
dilution in 1 x TAE). Each discernible band of 
the differential display pattern was extracted from 
the gel with a scalpel and the DNA eluted using a 
Genelute™ agarose spin column (Supelco, Belle- 
fonte, USA). Five microlitres of the eluted DNA 
was reamplified using the original nested (sec- 
ondary) PCR primers supplied with the PCR-Se- 
lect™ cDNA subtraction kit. The PCR products 
were electrophoresed on a 2% standard agarose 
gel (Boehringer Mannheim, East Sussex, UK) and 
the reamplified target bands extracted from the 
gel as above. The eluted DNA was immediately 
ligated into a TOPO TA Cloning® vector (Invitro- 
gen, Carlsbad, USA) before transformation in 
Escherichia coli TOP10F' One Shot™ cells 
(Invitrogen). 
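
The buffer, stain and field-strength figures quoted in Section 2.4 reduce to simple dilution arithmetic, sketched below. This is an illustration only: the 50x TAE stock, the 1 litre and 100 ml working volumes and the 20 cm electrode spacing are assumptions chosen for the example and are not given in the text; the function names are hypothetical.

```python
def stock_volume(final_volume: float, dilution_factor: float) -> float:
    """Volume of concentrated stock needed for a given final volume.

    The result is in the same unit as `final_volume`.
    """
    if final_volume <= 0 or dilution_factor < 1:
        raise ValueError("final volume must be positive and dilution factor >= 1")
    return final_volume / dilution_factor


def running_voltage(field_v_per_cm: float, electrode_distance_cm: float) -> float:
    """Voltage to apply for a target field strength across the gel tank."""
    return field_v_per_cm * electrode_distance_cm


if __name__ == "__main__":
    # 1x TAE from an assumed 50x stock, for 1000 ml of running buffer.
    print(stock_volume(1000.0, 50), "ml of 50x TAE")
    # SYBR Green I at 1:10000 in an assumed 100 ml (100000 µl) of 1x TAE.
    print(stock_volume(100_000.0, 10_000), "µl of SYBR Green I")
    # 3.75 V/cm across an assumed 20 cm electrode spacing.
    print(running_voltage(3.75, 20.0), "V")
```

Under those assumptions the figures work out to 20 ml of stock buffer, 10 µl of stain and 75 V.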

2.5. Colony screening 

2.5.1. Stage I 

Eight transformed (white) colonies from each 
band were grown up for 6 h in 200 µl LB broth 
containing ampicillin (Sigma, 50 mg/ml). One mi- 
crolitre of this was subjected to PCR using the 
same conditions and nested primers as described 
above. One tenth (2 µl) of the completed PCR 
reaction was electrophoresed on a 2% standard 
agarose gel and one tenth on a 2% standard 
agarose gel containing HA red (Hanse Analytik 
GmbH, Bremen, Germany, 1 U/ml) to discern the 
differentially cloned products. The remainder of 
the PCR reaction was used to prepare duplicate 
dotblots on Hybond N + membranes (Amersham, 
Little Chalfont, UK). 

2.5.2. Stage II 

The duplicate dotblots were probed with (a) the 
final differential display reaction and (b) the 're- 
verse-subtracted' differential display reaction. To 
make the 'reverse-subtracted' probe, the subtrac- 
tive hybridisation step of the differential display 
RT-PCR procedure was carried out using the 
original tester (treated) mRNA as the driver and 
the original driver (control) mRNA as the tester. 
Probing and visualisation were carried out using 
the ECL direct nucleic acid labelling and detec- 
tion system (Amersham, Little Chalfont, UK) ac- 
cording to the manufacturer's instructions. Those 
clones that were positive for (a) but negative for 
(b), or showed a substantially larger positive sig- 
nal with (a) compared to (b), were selected for 
DNA sequence analysis. 

2.6. DNA sequencing 

The remainder of the cultures (prepared in 
stage I screening) containing different cloning 
products (as discerned in the two screening steps) 
were grown up overnight in 5 ml LB broth con- 
taining ampicillin (50 mg/ml). A plasmid miniprep 
was prepared from each (Wizard Plus SV 
Minipreps DNA purification system, Promega) 
according to the manufacturer's instructions. The 
cloned inserts were sequenced on an automated 
ABI DNA sequencer (Applied Biosystems, War- 
rington, UK) using the M13 forward primer 
(GTAAAACGACGGCCAGT) or M13 reverse 
primer (AACAGCTATGACCATG). 

2.7. Identification of differentially regulated genes

Gene sequences thus obtained were identified using the FASTA 3.0
programme (Lipman and Pearson, 1985; Pearson and Lipman, 1988)
(http://www.ddbj.nig.ac.jp/E-mail/homology.html) to search all EMBL
databases for matching DNA sequences. Each clone sequence was submitted
in the forward and reverse direction, and the one returning the highest
statistical probability of match to a known sequence was noted.
Sequence homologies between our submitted clone sequence and the
queried database sequence were determined (by FASTA) over a region of
at least 60 base pairs.
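
Submitting a clone in both directions amounts to querying with the sequence and with its reverse complement. A minimal sketch of that preparation step follows; the function names are assumptions for illustration, and the M13 forward primer sequence is used only as a convenient test string.

```python
# Translation table for DNA bases (upper and lower case)
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def both_orientations(seq):
    """Return the two query sequences for one clone read."""
    return {"forward": seq, "reverse": reverse_complement(seq)}


queries = both_orientations("GTAAAACGACGGCCAGT")
print(queries["reverse"])  # ACTGGCCGTCGTTTTAC
```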

2.8. RT-PCR analysis of selected candidate genes

cDNA sequences of the target genes were obtained from the NIH gene
database (GenBank at http://www.ncbi.nlm.nih.gov/Web/Search/index.html)
and the computer programme Gene Jockey (BioSoft, Cambridge, UK) was
used to select primer pairs from these sequences. Where guinea pig
sequences were available, rat and guinea pig sequences were aligned and
primers chosen from regions of homology. If guinea pig sequences were
not available, rat and human sequences were used. In cases where exact
homology could not be found, the sequence from the rat was used. In the
case of CD81 only, no rat or guinea pig sequences were available and so
mouse and human sequences were aligned and a primer pair chosen from a
region of homology. Primers (obtained from Gibco-BRL, Paisley, UK) were
dissolved at a concentration of 50 pmol/µl in sterile distilled water
and stored at -20°C. The primer pairs used plus other reaction
parameters are shown in Table 1. mRNA was extracted (as described
above) from all four treated animals and from three animals in the
control group. Integrity of the eluted mRNA was confirmed on a 2%
agarose gel, and the concentration and purity were measured using a
Genequant II spectrophotometer (LKB, Bromma, Sweden); the mRNA was then
diluted to 10 ng/µl. One microlitre of this latter solution was used
per RT-PCR reaction.

RT-PCR was carried out in a single tube (50 µl)
reaction using the Access RT-PCR system 
(Promega) according to manufacturer's instruc- 
tions. In the kinetic and quantitative analyses, 
omission of RNA was used as a control for the 
presence of any contaminating DNA. After ob- 
taining a PCR signal of the correct size and 
optimising the reaction conditions, each PCR 
product was digested with between two and four 
separate restriction enzymes. Specific restriction 
patterns were thus obtained, which further confi- 
rmed the identity of the PCR products as being 
the original target genes. Kinetic analysis (14-32 
cycles) was then performed in each case to deter- 
mine the location of the mid-log phase. 

For the semi-quantitative analysis of each target gene, RT-PCR
reactions were carried out in triplicate for each sample to reduce the
effect of intertube RT-reaction variations (Kolls et al., 1993) and
pipetting errors. For each gene, a mastermix containing enough reagents
for three times the number of samples (seven for rat, six for guinea
pig) was prepared except that mRNA was omitted, the latter being added
after aliquoting 49 µl of the mastermix into an appropriate number of
tubes. Amplification of albumin (the reference gene) was carried out in
separate tubes since the mid-log phase of this gene is at a much lower
cycle number than the target genes due to its high abundance. All
RT-PCR products were analysed on 2% agarose gels containing 0.5 µg/ml
ethidium bromide. The target gene samples were loaded on the gel first
and run in at 3 V/cm for 10 min. The corresponding albumin samples were
then loaded and the gel run for a further 1/2 h. In this way, all
RT-PCR products from each target gene and albumin from the
corresponding samples could be run on the same gel. Gels were
photographed using type 665 posi-neg film (Sigma) and quantitation of
the band intensity was carried out using a dual wavelength flying spot
laser scanner densitometer (Shimadzu).
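
The mastermix arithmetic described above can be written out as a short sketch. The sample numbers, triplicate design, 49 µl aliquot and 1 µl of mRNA come from the text; the function name and the returned field names are assumptions made for this example, and no allowance for pipetting loss is included because the text does not mention one.

```python
def mastermix_plan(n_samples, replicates=3, aliquot_ul=49.0, mrna_ul=1.0):
    """Tube count and mastermix volume for one gene (mRNA added per tube)."""
    n_tubes = n_samples * replicates
    return {
        "tubes": n_tubes,
        "mastermix_ul": n_tubes * aliquot_ul,  # volume aliquoted in total
        "reaction_ul": aliquot_ul + mrna_ul,   # final volume per tube
    }


print(mastermix_plan(7))  # rat: seven samples
print(mastermix_plan(6))  # guinea pig: six samples
```

With these figures each tube reaches the 50 µl reaction volume stated for the Access RT-PCR reactions.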

2.9. Statistical analysis 

Statistical analysis of unpaired samples was carried out using the
two-tailed Student's t-test. Values were considered statistically
significant at P < 0.05.
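
For readers who wish to reproduce this kind of comparison, the following is a minimal standard-library sketch of a two-tailed unpaired Student's t-test with pooled variance. It is not the authors' software; the P value is obtained by numerically integrating the t density (Simpson's rule), and the densitometry values in the usage example are invented purely for illustration.

```python
import math
from statistics import mean, variance


def t_pdf(x, df):
    """Probability density of Student's t distribution."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)


def two_tailed_p(t, df, steps=20000):
    """Two-tailed P value via Simpson's rule on [0, |t|] (steps must be even)."""
    b = abs(t)
    if b == 0:
        return 1.0
    h = b / steps
    total = t_pdf(0.0, df) + t_pdf(b, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return max(0.0, 1.0 - 2.0 * total * h / 3)


def student_t_test(x, y):
    """Unpaired, equal-variance t-test; returns (t, df, two-tailed P)."""
    nx, ny = len(x), len(y)
    df = nx + ny - 2
    pooled = ((nx - 1) * variance(x) + (ny - 1) * variance(y)) / df
    t = (mean(x) - mean(y)) / math.sqrt(pooled * (1 / nx + 1 / ny))
    return t, df, two_tailed_p(t, df)


# Hypothetical normalised band intensities: four treated, three control
treated = [0.50, 0.45, 0.55, 0.50]
control = [1.00, 0.95, 1.05]
t, df, p = student_t_test(treated, control)
print(df, p < 0.05)
```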






3. Results 




Fig. 1. Final displays of differentially expressed genes that were (1)
upregulated and (2) downregulated in rat (A) and guinea pig (B) livers
following 3-day treatment with Wy-14,643. mRNA extracted from control
and treated livers was used to generate the differential displays using
the PCR-Select cDNA subtraction kit (Clontech). Lane (L) is a 1 Kb DNA
ladder standard and 10 µl of secondary PCR reaction were loaded in all
other lanes.



3.1. Cloning and screening of transcripts 

For both the rat and guinea pig experimental 
groups, cDNA subtraction was carried out in the 
forward (control driving tester) and reverse (tester 
driving control) directions to isolate both upregu- 
lated and downregulated mRNA species respec- 
tively. Using a standard primary hybridisation 
time of 8 h we obtained a substantial amount of 
non-specific products in all the final differential 
displays (data not shown). This background 
smearing was almost completely removed by re- 
ducing the primary hybridisation time to 4 h 
(CLONTECHniques, 1996). Fig. 1 shows the 
ddRT-PCR patterns of genes showing altered ex- 
pression in rat and guinea pig liver following 
3-day treatment with Wy-14,643. The profiles are
unique for each species, and in each case the 
profile for the upregulated genes (control mRNA 
driving tester mRNA) is different to that obtained 
for the downregulated genes (tester mRNA driv- 
ing control mRNA). 

The practical outcome of the SSH method is that a series of
differentially expressed genes is observed as a ladder on an agarose
gel. The majority of these gene fragments fall within the 150-2000 bp
range, with bands up to 5 Kbp occasionally being observed. Each band
may theoretically consist of one or more products of similar size, as
the gel has a maximum resolution of approximately 1.5% (3 bp per 200).
In addition, there may be two or more products that are the same size,
but have a different sequence. Therefore some form of discrimination
must be employed to isolate as many of these products as possible.
HA-red screening (Geisinger et al., 1997) of a number of clones derived
from each band provided a means to discriminate between different gene
species of the same size. A typical example of such a gel is shown in
Fig. 2. In total, 88 and 48 apparently different clones were obtained
from the final differential expression patterns of upregulated and
downregulated rat genes, respectively. Sixty nine and 89 apparently
different clones were obtained from the final differential expression
patterns of the upregulated and downregulated guinea pig genes,
respectively.

Fig. 2. Discrimination of different ddRT-PCR products having the same
molecular size using HA-red. Gel (A) is a 2% standard agarose gel. Gel
(B) is a 2% standard agarose gel containing 1 U/ml HA-red. Band numbers
refer to the sequential bands (largest to smallest) extracted from the
original display of genes upregulated in rat liver following 3-day
treatment with Wy-14,643. Ten microlitres of each PCR reaction were
loaded per lane.

Having identified as many different candidate gene products as possible
in screening step I, a second screening step was carried out on every
clone to confirm those that represented true differentially expressed
genes. This is necessary since no subtraction technique is 100%
efficient. The approach we used, termed PCR-select differential
screening (as recommended in Clontech's PCR-select cDNA subtraction kit
protocol), utilises the forward and reverse subtractions as an aid to
screening for the true differentially expressed genes
(CLONTECHniques, 1997). Because these probes have already undergone
subtraction, they have been enriched for differentially expressed genes
and are therefore more sensitive than unsubtracted driver/tester cDNA
probes for detecting true differential expression. All the clones that
were isolated from each display were dotblotted and probed with the
display from which they were obtained, plus the corresponding
reverse-subtracted display. An example of such a blot is shown in
Fig. 3. Clones corresponding to authentic differentially expressed
mRNAs hybridised with the subtracted cDNA probe, but not the
reverse-subtracted probe. We also included in the authentic positives
those clones that gave a substantially greater signal with the
subtracted probe compared to the reverse-subtracted probe. False
positives hybridised either with both probes or with neither probe. Of
the original 88 upregulated and 48 downregulated rat clones selected
for this screening step, 28 (32%) and 15 (31%) respectively, were found
to be true positives. In the rat, 28 (100%) of the true positive
upregulated genes (Table 2) and 11 (73%) of the true positive
downregulated genes (Table 3) were non-redundant. Of the original 69
upregulated and 89 downregulated guinea pig clones selected for this
screening step, 48 (70%) and 37 (42%) respectively, were found to be
true positives. Thirty six (75%) of the upregulated genes (Table 4) and
33 (89%) of the downregulated genes (Table 5) were non-redundant.
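
The screening yields quoted above can be cross-checked with a few lines of arithmetic. In this sketch the dictionary layout and function names are assumptions made for illustration. Two of the counts (the 28 non-redundant rat upregulated clones and the 48 guinea pig upregulated true positives) are partly illegible in the scanned text and are inferred here from the stated percentages.

```python
def percent(part, whole):
    """Percentage rounded to the nearest whole number."""
    return round(100 * part / whole)


# Counts from the screening paragraph (two values inferred, see lead-in)
screening = {
    "rat up": {"screened": 88, "true_pos": 28, "non_redundant": 28},
    "rat down": {"screened": 48, "true_pos": 15, "non_redundant": 11},
    "guinea pig up": {"screened": 69, "true_pos": 48, "non_redundant": 36},
    "guinea pig down": {"screened": 89, "true_pos": 37, "non_redundant": 33},
}


def summarise(counts):
    """True-positive rate among screened clones, and non-redundancy among true positives."""
    return {
        "true_positive_pct": percent(counts["true_pos"], counts["screened"]),
        "non_redundant_pct": percent(counts["non_redundant"], counts["true_pos"]),
    }


for group, counts in screening.items():
    print(group, summarise(counts))
```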

3.2. Identification of clones

On sequence analysis it was found that some clones were unsequencable
in the first instance (M13 forward primer) due to long polyA runs that
appeared to prematurely terminate the sequencing reaction. These clones
were therefore sequenced from the opposite direction using the M13
reverse primer. Those xenobiotic-modulated gene products identified to
date are listed in Tables 2 and 3 (rat) and Tables 4 and 5 (guinea
pig).




Fig. 3. Dot blots of clones of putative upregulated gene species
isolated from guinea pig liver following 3-day treatment with
Wy-14,643. All clones identified in the stage I screening step (see
Methods) were blotted and probed with (A) the differential display from
which they originated (control driving treated) and (B) the reverse
subtraction (treated driving control). Arrows indicate some of the true
differentially expressed clones.



Table 2

Identification of genes that were upregulated in male rat liver
following 3-day treatment with Wy-14,643

FASTA-EMBL gene identification                    Accession No. Sequence
(rat unless otherwise stated)                                   homology a (%)

Carnitine octanoyl transferase                    RN26033       99
NCI_CGAP_Li1 (H. sapiens) (EST b)                 HS1275949     98
Peroxisomal enoyl hydratase-like protein          RN08976       98
Liver fatty acid binding protein                  V01235        [illegible]
Soares mouse p3NMF19.5 M. musculus cDNA clone     AA038051      [illegible]
Cytochrome P450IVA1                               RNCYPT A      [illegible]
[illegible] methylglutaryl CoA synthase           [illegible]   [illegible]
Rab geranylgeranyl transferase component B        RNRABGERA     94
Genes for 18S, 5.8S, and 28S ribosomal RNAs       RNRRNA        94
Carnitine acetyl transferase (mouse)              MMRNACAR      92
Soares mouse NML (EST)                            MM1157113     92
Bone marrow stromal fibroblast (H. sapiens)       AA545726      92
  cDNA clone HBMSF2E4 (EST)
7.5dpc embryo (mouse) (EST)                       AA408192      92
Alpha-1-macroglobulin                             RNALPH1M      91
Transferrin                                       RNTRANSA      91
Lecithin:cholesterol acyltransferase              RNU62803      90
Zn-α2-glycoprotein                                RN2A2GA       90
Serum albumin                                     RNJALBM       89
Fructose-1,6-bisphosphate 1-phosphohydrolase      RNFBP         88
Soares mouse melanoma (EST) (S c)                 AA124706      88
Soares mouse 3NbMS (EST) (AS c)                   AA154039      88
17-β-hydroxysteroid dehydrogenase                 RN17BHDT2     87
Soares mouse p3NMF19.5 (EST)                      AA038051      87
Peroxisomal enoyl-CoA:hydratase-3-hydroxyacyl     RNPECOA       85
  CoA bifunctional enzyme
Integral membrane protein, TAPA-1 (CD81) (mouse)  S45012        81
Soares mouse lymph node (EST)                     MMAA88445     81
H. sapiens (clone zap128) mRNA                    L40401        76
Lysophospholipase homologue (human)               HSU67963      76
Soares mouse lymph node (EST)                     AA217044      74

a Refers to the nucleotide sequence homology between the cloned band
isolated from the differential display and the corresponding gene
derived from the EMBL gene sequence bank.

b EST is 'expressed sequence tag', a gene of as yet unknown identity
and function.

c Where sequence homologies were equal in both directions of the
isolated band, both the sense (S) and antisense (AS) identities are
given.



Table 3

Identification of genes that were downregulated in male rat liver
following 3-day treatment with Wy-14,643

FASTA-EMBL gene identification                    Accession No. Sequence
(rat unless otherwise stated)                                   homology a (%)

NCI_CGAP_Li1 (H. sapiens) (EST b) (S c)           AA484528      99
NCI_CGAP_Pr1 (H. sapiens) (EST) (AS c)            AA469320      99
UDP-glucuronosyltransferase (UGT2B12)             RN06273       98
Complement component C3                           RNC3          96
Soares mouse placenta (S)                         AA023305      96
Ape (chimpanzee) 28S rRNA (AS)                    PTRGMC        96
Rat CYP2C11                                       RNCYPM1       95
Ribosomal protein S5                              RNRPS5        94
Transthyretin                                     RNTTHY        94
Contrapsin-like protease inhibitor                RNCCP23       89
Prostaglandin F2α (S)                             RN26663       84
β-2-microglobulin (AS)                            RNB2MR        84
Apolipoprotein C-III                              RNAPOA02      82
Parathymosin-alpha (zinc2+-binding protein)       RNUZNBP       75

a Refers to the nucleotide sequence homology between the cloned band
isolated from the differential display and the corresponding gene
derived from the EMBL gene sequence bank.

b EST is 'expressed sequence tag', a gene of as yet unknown identity
and function.

c Where sequence homologies were equal in both directions, both the
sense (S) and antisense (AS) identities are given.



In all cases, both the forward and reverse sequence of the target
clones were analysed and the gene having the highest statistical
homology noted.

3.3. RT-PCR analysis of selected clones

The results of a typical RT-PCR semi-quantitation experiment for
transferrin in the rat are given in Fig. 4 and the results for a total
of 12 selected genes in both the rat and guinea pig are shown in
Table 6.



4. Discussion 

It is now apparent that all cancers arise from accumulated genetic
changes within the cell. Although documenting and explaining these
changes presents a formidable obstacle to understanding the different
mechanisms of carcinogenesis, the experimental methodology is now
available to begin attempting this difficult challenge. In order to
begin the elucidation of the molecular mechanisms involved in
non-genotoxic hepatocarcinogenesis, we have used SSH to identify a
number of genes that are upregulated or downregulated in male rat and
guinea pig livers following short term exposure to the PP, Wy-14,643.
We have used the rat model to represent a species susceptible to the
non-genotoxic carcinogenic effect of PPs and the guinea pig as a
resistant species (Orton et al., 1984; Rodricks and Turnbull, 1987;
Lake et al., 1989; Makowska et al., 1992; Lake et al., 1993).

Gurskaya et al. (1996), who originally developed the SSH technique,
cloned the products of the secondary PCR reaction and screened a small
number of randomly selected colonies for differentially expressed
clones using northern hybridisation. However, we decided against this
approach



Table 4

Identification of genes that were upregulated in male guinea pig liver following 3-day treatment with Wy-14,643

FASTA-EMBL gene identification (guinea pig unless otherwise stated)         Accession No. Sequence
                                                                                          homology a (%)

Carboxylesterase                                                            [illegible]   [illegible]
Complement protein [illegible]                                              [illegible]   97
Cytosolic aldehyde dehydrogenase [species illegible]                        [illegible]   92
Catalase (human)                                                            [illegible]   89
Mitochondrial aspartate aminotransferase [species illegible]                [illegible]   89
Elongation factor-1-alpha [species illegible]                               [illegible]   [illegible]
[illegible] cDNA clone (EST) (similar to [illegible] carboxykinase)         [illegible]   [illegible]
Alpha-1-antiproteinase S                                                    M57270        [illegible]
Formyltetrahydrofolate dehydrogenase (rat)                                  M59861        [illegible]
Ribosomal protein L6 (rat)                                                  X87107        [illegible]
Soares pregnant uterus Nb (EST) (mouse)                                     AA156847      83
Mitochondrial citrate transport protein (human)                             L77567        80
Cytoplasmic chaperonin hTRiC5 (human)                                       U17104        80
Alpha-1-antiproteinase F                                                    M57271        77
Heterogeneous nuclear ribonucleoprotein C1/C2 (human)                       D28382        77
Soares parathyroid tumour (EST) (similar to human serum albumin precursor)  AA860651      76
Stratagene mouse kidney (EST)                                               AA107327      75
Soares parathyroid tumour NbHPA human cDNA (EST)                            AA860653      74
Soares mouse mammary gland (EST)                                            AA619297      74
cDNA clone 15 004 (EST) (human)                                             H01826        74
Soares senescent fibroblasts (EST) (mouse)                                  W52190        74
Preproalbumin (human)                                                       E04315        72
cDNA clone 73 169 (EST) (human)                                             T56624        72
Vitamin D-binding protein (human)                                           L10641        71
[illegible]H gene (exon 8) (human)                                          Y11498        71
[illegible]IL flow sorted chromosome                                        B05457        71
Soares foetal liver spleen (EST) (mouse)                                    AA009524      71
Soares foetal heart NbMH19W (EST) (mouse)                                   AA009421      69
Soares foetal heart NbHH19W H. sapiens cDNA clone (EST)                     W94377        67
Phenylalanine hydroxylase (human)                                           U49897        67
Pyrroline-5-carboxylate dehydrogenase (human)                               U24266        66
Glutathione-S-transferase homologue (human)                                 U90313        65
NCI_CGAP_GCB1 (EST) (human)                                                 AA769294      65
Protective protein (human)                                                  M22960        64
cDNA clone 27 375 (EST) (human)                                             N37046        62
Stratagene colon (#937204) H. sapiens cDNA clone (EST)                      AA149777      62

a Refers to the nucleotide sequence homology between the cloned band isolated from the differential display and the
corresponding gene derived from the EMBL gene sequence bank.
Table 5

Identification of genes that were downregulated in male guinea pig
liver following 3-day treatment with Wy-14,643

FASTA-EMBL gene identification                            Accession   Sequence
(guinea pig unless otherwise stated)                      No.         homology a (%)

Complement C3 protein                                     M34054      97
Murinoglobulin                                            D84339      95
Alpha-1-antiproteinase F                                  M57271      88
Elongation factor-alpha-1 (rabbit)                        X62245      89
Coupling protein G (human)                                X04409      88
NCI_CGAP_Ov1 (EST b) (human)                              AA586309    87
Lecithin:cholesterol acetyl transferase (rabbit)          D13668      85
Aldolase B (human)                                        X00270      84
Anti-thrombin III (human)                                 E00116      80
Phenylalanine hydroxylase (human)                         K03020      80
Inter-α-trypsin inhibitor (human)                         D38595      79
Normalised rat muscle (EST) (S c)                         AA849753    78
Normalised rat ovary (EST) (AS c)                         AA801059    78
Complement factor Ba fragment (human)                     X00284      77
Dihydrodiol dehydrogenase (human)                         U05598      76
Spot 14 gene (thyroid-inducible hepatic protein) (human)  Y08409      75
BAC clone 174p12 (human)                                  AC004236    75
Mitochondrial aldehyde dehydrogenase (human)              X05409      74
Preproalbumin (human)                                     E04315      74
NCI_CGAP_Pr9 (EST) (human) (S)                            AA533142    74
Normalised rat placenta (EST) (AS)                        AA851197    74
Heparin sulfate proteoglycan (human)                      J04621      73
cDNA clone 33 992 (EST) (human)                           R24330      73
Retinol dehydrogenase (rat)                               U33501      71
TAPA-1 integral membrane protein (CD81) (mouse)           S45012      71
Complement component c5s                                  M35525      70
Apolipoprotein B (pig)                                    L11235      69
cDNA clone 143 918 (EST) (human)                          R76742      68
α-fibrinogen (human)                                      K02569      68
Soares foetal liver, spleen 1NF (mouse)                   W03726      68
Barstead bowel (EST) (mouse)                              AA232049    67
UDP glucuronosyl transferase (cat)                        AF0309137   66
Myeloid leukaemia cell differentiation protein            L08246      65
  (MCL-1) (human) (S)
STS SHGC-34 987 (human) (AS)                              G27984      65
Soares mouse 3NME125                                      AA222798    64
Stratagene mouse embryonic (EST) (S)                      AA199420    64
Rad 52 (mouse)                                            AF004854    63

a Refers to the nucleotide sequence homology between the cloned band
isolated from the differential display and the corresponding gene
derived from the EMBL gene sequence bank.

b EST is 'expressed sequence tag', a gene of as yet unknown identity
and function.

c Where sequence homologies were equal in both directions, both the
sense (S) and antisense (AS) identities are given.

for several reasons: (1) the kinetics of ligation and transformation
favour the isolation of smaller PCR products, thereby producing a
misrepresentation of larger gene products; (2) northern blot analysis
is notoriously insensitive and is unlikely to confirm expression of
rare transcripts; (3) there is no measurable end point to the screening
of clones produced in this way other than to analyse every transformed
colony. We used instead an alternative approach; after running out the
differential display on a high-resolution agarose gel (Fig. 1) and
overstaining with SYBR Green I to enhance visualisation, the composite
bands were individually extracted, reamplified and cloned. However, it
has been well documented that single bands from differential displays
often contain a heterogeneous mixture of different products
(Mathieu-Daude et al., 1996; Smith et al., 1997). This is because
polyacrylamide gels cannot discriminate between DNA sequences that
differ in size by less than about 0.2% (Sambrook et al., 1989).
High-resolution agarose gels such as those used in this work are even
less sensitive, normally only discriminating products that differ in
size by at least 1.5%. The use of the HA-red screening step enables
resolution of identical or nearly identical sequences based on their AT
content (Wawer et al., 1995) and is sensitive down to < 1% difference.
Furthermore, it is rapid, technically simple and does not require the
use of radiolabels. Geisinger et al. (1997) originally demonstrated the
usefulness of using HA-red to identify different products cloned from
the same band of an RNA differential display experiment by
simultaneously running them in normal agarose (to discriminate by size)
and in normal agarose containing HA-red (to discriminate by AT
content). We have found that this approach is equally useful for
identifying different gene species cloned from the same band of our SSH
display.
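
The property that HA-red exploits, AT content, is easy to compute for any sequenced clone. The sketch below is only an illustration of that quantity: the function name and the two 20-base sequences are hypothetical, chosen so that both products have the same length but different AT content.

```python
def at_content(seq):
    """Fraction of A and T bases in a DNA sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return (seq.count("A") + seq.count("T")) / len(seq)


# Two hypothetical products of identical size (20 bases each)
clone_1 = "AAAAATTTTTGGGGGCCCCC"
clone_2 = "AATTGGGGGCCCCCGGGCCC"
print(len(clone_1) == len(clone_2), at_content(clone_1), at_content(clone_2))
```

Products like these would co-migrate on a standard agarose gel, which separates by size alone, and would be expected to separate only when a base-composition-dependent ligand is present.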

Diatchenko et al. (1996) reported that SSH is highly efficient at
producing differentially expressed gene species. However, we also
included a second screening step to further confirm that the clones
isolated from the differential display were indeed differentially
expressed. Duplicate dotblots of the candidate clones were probed with
the display from which they were originally isolated and with the
'reverse subtraction' display. To make the reverse-subtracted probe,
the subtractive hybridisation step of the procedure was carried out
using the original tester cDNA as a driver, and the original driver
cDNA as a tester. In this way, clones that are false positives can be
identified through their presence in both blots. Such false positives
most commonly arise through having a very high abundance in the initial
sample or unusual hybridisation properties (Li et al., 1994).



Although the SSH method itself has been shown to be efficient, and
despite the screening step that we included, there is an important
caveat to bear in mind: all clones should be considered only as
'candidates' until the actual abundance of their mRNA is quantitated in
treated and control samples. Towards this end, we examined the
expression of a limited number of clones using semi-quantitative
RT-PCR. Albumin was used as the reference gene as we have previously
found that the expression of this gene does not appear to change with
the treatment regime that we used (Fig. 4, and data not shown). There
are a number of interesting points to note from our results. The first
is the presence of genes that serve as appropriate positive controls in
the upregulated and downregulated series. For example, in the rat it
can be seen that CYP4A1 expression increases 14-fold following
treatment. Although CYP4A1 mRNA expression levels following Wy-14,643
treatment have not been previously reported in this model, the figure
compares favourably with that recorded by Bell et al. (1991), who used
RNAse-protection to quantitate CYP4A1 in rat liver following treatment
with methylclofenapate, another PP. In addition, we confirmed that the
peroxisomal enoyl-CoA:hydratase-3-hydroxyacyl-CoA bifunctional enzyme
is upregulated 9-fold, in agreement with the findings of Chen and Crane
(1992).
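
A fold change of this kind can be derived from densitometry readings by normalising each target band to the albumin band from the same sample and then comparing group means. The sketch below assumes that simple scheme; the exact calculation used by the authors is not stated, the function names are hypothetical, and the intensity values are invented so that the result matches the 14-fold figure quoted for CYP4A1.

```python
from statistics import mean


def normalised(target, reference):
    """Target band intensity divided by reference (albumin) intensity, per sample."""
    return [t / r for t, r in zip(target, reference)]


def fold_change(treated_target, treated_ref, control_target, control_ref):
    """Mean normalised expression in treated animals relative to controls."""
    treated = mean(normalised(treated_target, treated_ref))
    control = mean(normalised(control_target, control_ref))
    return treated / control


# Invented band intensities: four treated and three control animals
fc = fold_change([140, 140, 140, 140], [10, 10, 10, 10],
                 [10, 10, 10], [10, 10, 10])
print(fc)  # 14.0
```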

A number of genes were downregulated following Wy-14,643 exposure,
including CYP2C11. Corton et al. (1997) reported similar findings and
suggested that this may in part explain why male rats exposed to
Wy-14,643 and some other PPs have high serum estradiol levels, as
estradiol is a substrate for CYP2C11. We have also shown that the
expression of contrapsin-like protease inhibitor (CLPI) was
downregulated by Wy-14,643. This has not previously been reported, and
we suggest that it may be linked to a requirement for increased
availability of amino acids to accommodate the hepatomegaly induced by
treatment. Although little is known of the function of parathymosin-α
(zinc2+-binding protein), it has been shown to interact with the
globular domain of histone H1, suggesting a role in histone function
(Kondili et al., 1996). In contrast to the downregulation observed in
this work, other studies have shown that parathymosin-α expression is
elevated in breast cancer (Tsitsilonis et al., 1993, 1998), with the
implication that parathymosin-α may somehow be involved in regulating
cell proliferation by more than one mechanism. Transferrin has
previously been shown to be downregulated in rat liver by hypolipidemic
PPs (Hertz et al., 1996). It is therefore interesting to note that we
isolated a clone identified as transferrin from the upregulated display
profile. Since we confirmed by RT-PCR that transferrin is in fact
downregulated in the rat (Fig. 4), we conclude that transferrin was
either a false positive or was incorrectly identified. It could also be
that we have isolated a close relative, splice variant or isoform of
transferrin, which demonstrates a different expression profile under
these experimental conditions. Further investigations are therefore
required to determine which of these possibilities is correct.

Fig. 4. Semi-quantitative RT-PCR experiment showing relative decrease
in expression of transferrin in treated rat liver (RT-1 to RT-4)
compared to controls (RC-1 to RC-3). An equal amount of mRNA was used
in each reaction (10 ng), and each sample was quantitated in triplicate
to reduce the effects of inter-tube variation. N is negative control
(no mRNA). Lane M is a 100 bp ladder and lane L is a 1 Kb DNA ladder.

One of our most intriguing observations was 
that one gene, CD81, appeared to be upregulated 
in rat liver but downregulated in guinea pig liver 
following Wy-14,643 exposure. CD81 is a widely
expressed cell surface protein that is involved in a 
large number of cellular functions, including ad- 
hesion, activation, proliferation and differentia- 
tion (reviewed by Levy et al., 1998). Since all of 
these functions are altered to some extent in car- 
cinogenesis, it is perhaps an important observa- 
tion that CD81 expression is differentially 
regulated in a resistant and sensitive species ex- 
posed to a non-genotoxic carcinogen. 

Albumin and ribosomal genes appear common to all differential displays
and are thus undesirable false positives. However, due to their high
expression in the liver, they are difficult to remove. We also noted a
number of gene species, particularly in the guinea pig, which were
common to both upregulated and downregulated profiles. Again, the most
likely reason for these having arisen is their high abundance.

A relatively large number of upregulated and downregulated genes were
isolated from guinea pig liver following Wy-14,643 exposure. However,
the guinea pig genome has been relatively poorly characterised and so
many of the clones were identified as resembling genes or ESTs from
other species. Without full-length sequence data it is difficult to
ascertain the accuracy of the assigned identities and this must be
borne in mind when utilising data such as this, for example, in
designing effective primers for RT-PCR studies. Although the actual
isolated clone sequences can be used to do this, their relatively small
size often restricts the ability to design effective primers. In
addition, as we observed with transferrin, using a published
full-length sequence may help to identify false positives.



By comparing the expression profiles of genes 
showing altered expression in a PP-sensitive spe- 
cies (rat) with a PP-resistant species (guinea pig), 
it was our aim to identify genes that are mecha- 
nistically relevant to the non-genotoxic hepatocar- 
cinogenic action of Wy-14,643. However, few of 
the genes that we have isolated were common to 
both the rat and the guinea pig. This suggests 
either that the molecular mechanisms of response 
in these two species are so different that few genes 
are commonly regulated in response to Wy-14,643 
exposure, or that we have recovered only a small 
proportion of those genes that have altered ex- 
pression. The latter seems the more likely scenario 
since it is perceived that one of the main problems 
of subtractive hybridisation and other differential 
expression technologies is the inability to consis- 
tently isolate rare gene transcripts (Bertioli et al., 
1995). This is potentially problematic in that 
weakly expressed genes may play an important 
role in regulating key cellular processes, and that 
the majority of mRNA species are classified as 

Semi-quantitative RT-PCR analysis of selected gene species in the rat and guinea pig (a)

Dotblot: putative change of expression following treatment according to dotblot.
RT-PCR: change according to RT-PCR quantitation.

Transcript                                  Dotblot               RT-PCR
                                            Rat     Guinea pig    Rat                        Guinea pig
[illegible]                                 N/A     N/A           No change                  No change
Bifunctional enzyme                         Up      N/A           Upregulated* (9 x)         N/O
CYP2C11                                     Down    N/A           Downregulated*             N/D
CYP4A1                                      Up      N/A           Upregulated* (14 x)        N/D
Catalase                                    N/A     Up            No change                  N/O
CD81 (TAPA-1)                               Up      Down          N/O                        Upregulated** (1.4 x)
Contrapsin-like protease inhibitor          Down    N/A           Downregulated** (0.5 x)    N/D
Parathymosin-α (zinc 2+ binding protein)    Down    N/A           Downregulated** (0.6 x)    N/D
Transferrin                                 Up      N/A           Downregulated* (0.5 x)     No change
UDP-Glucuronosyl transferase                Down    N/A           Downregulated** (0.2 x)    N/O
Unknown-1                                   Down    N/A           No change (P = 0.06)       N/D
Zn-α2-glycoprotein                          Up      N/A           No change                  N/O

(a) N/A, not applicable; N/O, not optimised; N/D, not done.
* P<0.0005;
** P<0.05.

'rare' in abundance (Bertioli et al., 1995). However,
in their original paper describing the SSH
technique, Gurskaya et al. (1996) demonstrated
that SSH can enrich rare molecules between 1000- and
5000-fold in a single round of hybridisation.
Unfortunately, due to high background smearing 
in our initial experiments (which hindered identifi- 
cation of single bands), we were compelled to 
reduce the primary hybridisation time to only 4 h 
— a step that theoretically is likely to reduce the 
number of rare sequences (CLONTECHniques, 
1996). Furthermore, it has been claimed by the 
manufacturers that, whilst this technique can 
identify changes as small as 1.5-fold between the 
driver and tester populations, it is best suited to 
the isolation of genes that show a greater than 
5-fold increase (CLONTECHniques, 1996). In ad- 
dition, where tester and driver contain genes with 
large and small differences in abundance, the SSH 
method will be biased towards identifying those 
genes with the large differences (CLONTECH- 
niques, 1996). Thus, it is most probable that we 
have not isolated all of the more rarely expressed 
transcripts and those demonstrating small changes 
in expression. 

One problem that remains is identifying the 
function of genes isolated in SSH experiments as 
described herein, some of which may be crucial to 
the process of carcinogenesis, and are, to date, 
unidentified. However, we have provided evidence 
herein that SSH can be used to begin the process 
of characterising the extent and importance of 
altered gene expression in response to a chemical 
stimulus. Further development of this approach
should include characterisation of temporal and 
dose responses, and functional analysis studies 
including knockout mice. In combination, such 
studies should make a significant contribution to 
our understanding of the molecular mechanisms 
of action and physiological relevance of gene regulation
in non-genotoxic hepatocarcinogenesis. It
should then be possible to ascertain whether differentially
expressed genes are causally or casually
related to the chemical-induced toxicity, which
would represent a substantial mechanistic advance.

It is clear that there are also broader applica- 
tions for this experimental approach that go be- 
yond understanding the molecular mechanisms of 



peroxisome-proliferator induced non-genotoxic 
hepatocarcinogenesis in rodents. The potential 
medical and therapeutic benefits of elucidating the 
molecular changes that occur in any given cell in 
progressing from the normal to the carcinogenic
(or other diseased, abnormal or developmental)
state are very substantial. Notwithstanding the
lack of complete functional identification of altered
gene expression, gene profiling studies such as those
described herein essentially provide a 'fingerprint'
of each stage of carcinogenesis, and should help in
the elucidation of specific and sensitive biomarkers
for different types of cancer. Amongst other
benefits, such fingerprints and biomarkers could 
help uncover differences in histologically identical 
cancers, and provide diagnostic tests for the earli- 
est stages of neoplasia. In addition, the genes 
identified by this approach may be incorporated 
into gene-chip DNA-arrays, thus providing a 
standard genetic fingerprint for a particular toxin 
treatment in a particular species. Interrogation of 
these gene arrays for an unknown compound that 
has a similar pattern to the known reference 
chemical would then provide evidence that the 
unknown may have a toxicity profile similar to 
the 'standard' fingerprint, thereby serving as a 
mechanistically relevant platform for further de- 
tailed investigations. 
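
The fingerprint comparison proposed above can be illustrated with a minimal sketch. It assumes, purely for illustration, that each fingerprint is stored as a mapping from gene name to fold change relative to control, and that similarity is scored as the Pearson correlation of log2 fold changes over the genes shared by both profiles. Neither the data structure nor the choice of metric comes from the text; the function names are hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if sd_x == 0 or sd_y == 0:
        raise ValueError("a profile with no variation cannot be correlated")
    return cov / (sd_x * sd_y)


def fingerprint_similarity(reference, unknown):
    """Correlate log2 fold changes over the genes present in both profiles.

    reference, unknown: dict mapping gene name -> fold change (> 0).
    Returns a value between -1 (opposite pattern) and 1 (same pattern).
    """
    shared = sorted(set(reference) & set(unknown))
    if len(shared) < 2:
        raise ValueError("need at least two shared genes")
    ref = [math.log2(reference[g]) for g in shared]
    unk = [math.log2(unknown[g]) for g in shared]
    return pearson(ref, unk)


# Illustrative values only: a reference profile and an identical "unknown".
reference_profile = {"CYP4A1": 14.0, "Bifunctional enzyme": 9.0,
                     "UDP-Glucuronosyl transferase": 0.2}
same_profile = dict(reference_profile)
opposite_profile = {g: 1.0 / v for g, v in reference_profile.items()}
```

A score near 1 would suggest that the unknown compound alters the shared genes in the same direction and proportion as the reference chemical; what cutoff counts as "similar" would have to be established empirically.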
ABSTRACT We have developed high-density DNA microarrays
of yeast ORFs. These microarrays can monitor
hybridization to ORFs for applications such as quantitative
differential gene expression analysis and screening for
sequence polymorphisms. Automated scripts retrieved sequence
information from public databases to locate predicted ORFs
and select appropriate primers for amplification. The primers
were used to amplify yeast ORFs in 96-well plates, and the
resulting products were arrayed using an automated
microarraying device. Arrays containing up to 2,479 yeast ORFs
were printed on a single slide. The hybridization of fluorescently
labeled samples to the array was detected and quantitated
with a laser confocal scanning microscope. Applications
of the microarrays are shown for genetic and gene
expression analysis at the whole genome level.



The genome sequencing projects have generated and will con- 
tinue to generate enormous amounts of sequence data. The 
genomes of Saccharomyces cerevisiae, Haemophilus influenzae (1),
Mycoplasma genitalium (2), and Methanococcus jannaschii (3)
have been completely sequenced. Other model organisms have 
had substantial portions of their genomes sequenced as well 
including the nematode Caenorhabditis elegans (4) and the small 
flowering plant Arabidopsis thaliana (5). Given this ever- 
increasing amount of sequence information, new strategies are 
necessary to efficiently pursue the next phase of the genome 
projects — the elucidation of gene expression patterns and gene 
product function on a whole genome scale. 

One important use of genome sequence data is to attempt 
to identify the functions of predicted ORFs within the genome. 
Many of the ORFs identified in the yeast genome sequence 
were not identified in decades of genetic studies and have no 
significant homology to previously identified sequences in the 
database. In addition, even in cases where ORFs have signif- 
icant homology to sequences in the database, or have known 
sequence motifs (e.g., protein kinase), this is not sufficient to 
determine the actual biological role of the gene product. 
Experimental analysis must be performed to thoroughly un- 
derstand the biological function of a given ORF's product. 
Model organisms, such as S. cerevisiae, will be extremely
important in improving our understanding of other more 
complex and less manipulable organisms. 

To examine in detail the functional role of individual ORFs and 
relationships between genes at the expression level, this work 
describes the use of genome sequence information to study large 
numbers of genes efficiently and systematically. The procedure 
was as follows. (i) Software scripts scanned annotated sequence
information from public databases for predicted ORFs. (ii) The
start and stop position of each identified ORF was extracted
automatically, along with the sequence data of the ORF and 200
bases flanking either side. (iii) These data were used to automatically
select PCR primers that would amplify the ORF. (iv) The
primer sequences were automatically input into the automated
multiplex oligonucleotide synthesizer (6). (v) The oligonucleotides
were synthesized in 96-well format, and (vi) used in 96-well
format to amplify the desired ORFs from a genomic DNA
template. (vii) The products were arrayed using a high-density
DNA arrayer (7-10). The gene arrays can be used for hybridization
with a variety of labeled products such as cDNA for gene
expression analysis or genomic DNA for strain comparisons, and
genomic mismatch scanning purified DNA for genotyping (11).

METHODS 

Script Design. All scripts were written in UNIX Tool Command 
Language. Annotated sequence information from GenBank was 
extracted into one file containing the complete nucleotide se- 
quence of a single chromosome. A second file contained the 
assigned ORF name followed by the start and stop positions of that 
ORF. The actual sequence contained within the specified range, 
along with 200 bases of sequence flanking both sides, was extracted 
and input into the primer selection program primer 0.5 (White- 
head Institute, Boston). Primers were designed so as to allow 
amplification of entire ORFs. The selected primer sequences were 
read by the 96-well automated multiplex oligonucleotide synthe- 
sizer instrument for primer synthesis. The forward and reverse 
primers were synthesized in two separate 96-well plates in corre- 
sponding wells. All primers were synthesized on a 20-nmol scale.
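
The extraction step described above (ORF sequence plus 200 flanking bases on each side, taken from a chromosome file and a file of ORF names with start and stop positions) can be sketched as follows. The original scripts were written in Tool Command Language; this Python version is only an illustration. The function names, the whitespace-separated "name start stop" line format, and the use of 1-based inclusive coordinates with start <= stop are assumptions, and ORFs annotated on the opposite strand are not handled.

```python
def parse_orf_table(lines):
    """Yield (name, start, stop) from lines of the assumed form 'NAME START STOP'."""
    for line in lines:
        parts = line.split()
        if len(parts) != 3:
            continue  # skip blank or malformed lines
        name, start, stop = parts
        yield name, int(start), int(stop)


def extract_orf_with_flanks(chromosome, start, stop, flank=200):
    """Return the ORF plus up to `flank` bases on each side.

    Coordinates are assumed to be 1-based and inclusive, with start <= stop.
    Flanks are truncated at the chromosome ends.
    """
    if not (1 <= start <= stop <= len(chromosome)):
        raise ValueError("ORF coordinates fall outside the chromosome")
    left = max(start - 1 - flank, 0)           # 0-based slice start
    right = min(stop + flank, len(chromosome))  # slice end (exclusive)
    return chromosome[left:right]


# Toy chromosome: 300 bases, a 9-base ORF, 300 bases.
toy_chromosome = "A" * 300 + "ATGAAATAG" + "C" * 300
toy_table = ["YXX001 301 309"]
regions = {name: extract_orf_with_flanks(toy_chromosome, start, stop)
           for name, start, stop in parse_orf_table(toy_table)}
```

Each extracted region would then be passed to the primer selection program, which is not modelled here.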

ORF Amplification and Purification. Genomic DNA was iso- 
lated as described (12) and used as template for the amplification 
reactions. Each PCR was done in a total volume of 100 µl. A total
of 0.2 each of forward and reverse primers were aliquoted into
a 96-well PCR plate (Robbins Scientific, Sunnyvale, CA); a master
mix containing 0.24 mM each dNTP, 10 mM Tris (pH 8.5), 50 mM
MgCl2, 2.5 units Taq polymerase, and 10 ng of template was added
to the primers, and the entire mix was thermal cycled for 30 cycles
as follows: 15 min at 94°C, 15 min at 54°C, and 30 min at 72°C.
Products were ethanol precipitated in polystyrene v-bottom 96-
well plates (Costar). All samples were dried and stored at -20°C.

Arraying Procedure and Processing. Microarrays were 
made as described (8). 

A custom built arraying robot was used to print batches of 48
slides. The robot utilizes four printing tips which simultaneously
pick up ≈1 µl of solution from 96-well microtiter plates. After
printing, the microarrays were rehydrated for 30 sec in a humid
chamber and then snap dried for 2 sec on a hot plate (100°C). The
DNA was then UV crosslinked to the surface by subjecting the
slides to 60 millijoules of energy. The rest of the poly-L-lysine
surface was blocked by a 15-min incubation in a solution of 70 mM
succinic anhydride dissolved in a solution consisting of 315 ml of
1-methyl-2-pyrrolidinone (Aldrich) and 35 ml of 1 M boric acid
(pH 8.0). Directly after the blocking reaction, the bound DNA
was denatured by a 2-min incubation in distilled water at ≈95°C.



Abbreviation: YEP, yeast extract/peptone. 

†To whom reprint requests should be sent at the present address:
Synteni, Inc., 6519 Dumbarton Circle, Fremont, CA 94555. 
Galactose vs. Glucose

Fig. 1. Two-color fluorescent scan of a yeast microarray contain-
ing 2,479 elements (ORFs). The center-to-center distance between
elements is 345 µm. A probe mixture consisting of cDNA from yeast
extract/peptone (YEP) galactose (green pseudocolor) and YEP glu- 
cose (red pseudocolor) grown yeast cultures was hybridized to the 
array. Intensity per element corresponds to ORF expression, and 
pseudocolor per element corresponds to relative ORF expression 
between the two cultures. 

The slides were then transferred into a bath of 100% ethanol at 
room temperature. 

Probe Preparation: cDNA. Yeast cultures (100 ml) were grown
to ≈1 OD A600 and total RNA was isolated as described (13). Up
to 500 µg total RNA was used to isolate mRNA (Qiagen,
Chatsworth, CA). Oligo(dT)20 (5 µg) was added and annealed to
2 µg of mRNA by heating the reaction to 70°C for 10 min and
quick chilling on ice, plus 2 µl Superscript II (200 units/µl) (Life
Technologies, Gaithersburg, MD), 0.6 µl 50× dNTP mix (final
concentrations were 500 µM dATP, dCTP, dGTP, and 200 µM
dTTP), 6 µl 5× reaction buffer, and 60 Cy3-dUTP or
Cy5-dUTP (Amersham). Reactions were carried out at 42°C for
2 h, after which the mRNA was degraded by the addition of 0.3
µl 5 M NaOH and 0.3 µl 100 mM EDTA and heating to 65°C for
10 min. The sample was then diluted to 500 µl with TE and
concentrated using a Microcon-30 (Amicon) to 10 µl.

Probe Preparation: Genomic DNA. Fluorescent DNA was 
prepared from total genomic DNA as follows: 1 µg of random
nonamer oligonucleotides was added to 2.5 µg of genomic
DNA. This mixture was boiled for 2 min and then chilled on
ice. A reaction mixture containing dNTPs (25 µM dATP,
dCTP, dGTP, 10 µM dTTP, and 40 µM Cy3-dUTP or
Cy5-dUTP), reaction buffer (New England Biolabs), and 20
units exonuclease free Klenow enzyme (United States Bio-
chemical) was added, and the reaction was incubated at 37°C
for 2 h. The sample was then diluted to 500 µl with TE and
concentrated using a Microcon-30 (Amicon) to 10 µl.

Hybridization. Purified, labeled probe was resuspended in 11
µl of 3.5× SSC containing 10 µg Escherichia coli tRNA, and 0.3%
SDS. The sample was then heated for 2 min in boiling water,
cooled rapidly to room temperature, and applied to the array. The
array was placed in a sealed, humidified, hybridization chamber.
Hybridization was carried out for 10 h in a 62°C water bath, after
which the arrays were washed immediately in 2× SSC/0.2% SDS.
A second wash was performed in 0.1× SSC.

Analysis and Quantitation. Arrays were scanned on a 
scanning laser fluorescence microscope developed by Steve 
Smith with software written by Noam Ziv (Stanford University).
A separate scan was done for each of the two fluorophores
used. The images were then combined for analysis. A
bounding box, fitted to the size of the DNA spots, was placed 
over each array element. The average fluorescent intensity was 
calculated by summing the intensities of each pixel present in 
a bounding box and then dividing by the total number of pixels. 
Local area background was calculated for each array element 
by determining the average fluorescent intensity at the edge of 
the bounding box. To normalize for fluorophore-specific variation,
control spots containing yeast genomic DNA were
applied to each quadrant during the arraying process. These 
elements were quantitated and the ratios of the signals were 
determined. These ratios were then used to normalize the 
photomultiplier sensitivity settings such that the ratios of the 
fluorescence of the genomic DNA spots were close to a value 
of 1.0. The average signal intensity at any given spot was 
regarded as significant if it was at least two standard deviations 
above background. Each experiment was conducted in dupli- 
cate, with the fluorophores representing each channel re- 
versed. The ratios presented here are the average of the two 
experiments, except in the case in which the signal for the 
element in question was below the reliability threshold. The 
reliability threshold also determined the dynamic range of the 
experiment. For all of the experiments presented, the average 
dynamic range was ≈1 to 100. In the case where the fluorescence
from a very bright spot saturates the detector, differ-
ential ratios will, in general, be underestimated. This can be 
compensated for by scanning at a lower overall sensitivity. 
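
The quantitation rules in this section can be sketched in a few lines. The sketch assumes that a bounding box is available as a rectangular grid of pixel intensities, that "the edge of the bounding box" means its outermost ring of pixels, and that the standard deviation used for the significance test is that of the edge pixels. These readings, and all function names, are assumptions made for illustration rather than a description of the scanner software.

```python
from statistics import mean, stdev


def quantify_spot(box):
    """Quantify one array element from its bounding box of pixel intensities.

    box: list of rows, each a list of pixel values (at least 2 x 2).
    Returns (signal, background, significant) where
      signal      = sum of all pixels / number of pixels,
      background  = mean of the pixels on the edge of the box,
      significant = signal is at least 2 SD (of edge pixels) above background.
    """
    rows, cols = len(box), len(box[0])
    all_pixels = [p for row in box for p in row]
    edge_pixels = [box[r][c]
                   for r in range(rows) for c in range(cols)
                   if r in (0, rows - 1) or c in (0, cols - 1)]
    signal = mean(all_pixels)
    background = mean(edge_pixels)
    significant = (signal > background and
                   signal - background >= 2 * stdev(edge_pixels))
    return signal, background, significant


def dye_swap_ratio(ratio_experiment_1, ratio_experiment_2):
    """Average the ratios from the two fluorophore-reversed experiments.

    Both ratios are assumed to be expressed in the same direction
    (for example, treated / control) before averaging.
    """
    return mean([ratio_experiment_1, ratio_experiment_2])


# Toy 4 x 4 bounding box: dim edge pixels around a bright 2 x 2 spot.
toy_box = [[10, 12, 10, 12],
           [12, 100, 100, 10],
           [10, 100, 100, 12],
           [12, 10, 12, 10]]
toy_signal, toy_background, toy_significant = quantify_spot(toy_box)
```

Normalization against the genomic DNA control spots was done by adjusting photomultiplier settings at scan time, so it is not modelled here.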

RESULTS 

The accumulation of sequence information from model organ- 
isms presents an enormous opportunity and challenge to under- 
stand the biological function of many previously uncharacterized 
genes. To do this accurately and efficiently, a directed strategy 
was developed that enables the monitoring of multiple genes 
simultaneously. Microarraying technology provides a method by 
which DNA can be attached to a glass surface in a high-density 
format (8). In practice, it is possible to array over 6,000 elements 
in an area less than 1.8 cm². Given that the yeast genome consists
of ≈6,100 ORFs, the entire set of yeast genes can be spotted onto
a single glass slide. 

With this capability and the availability of the entire se- 
quence of the yeast genome, our strategy was to use a directed 
approach for generating the complete genome array. This 
procedure involved synthesizing a pair of oligonucleotide 
primers to amplify each ORF. The PCR product containing 
each gene of interest was arrayed onto glass and used, for 
example, as probe for monitoring gene expression levels by 
hybridizing to the array labeled cDNA generated from isolated 
mRNA of a culture grown under any experimental condition. 

Primer Selection and Synthesis. The primer selection was fully 
automated using Tool Command Language scripts and PRIMER 0.5
(Whitehead). Primer pairs were automatically selected successfully
for >99% of the ORFs tested. Primer sequences can thus
be selected rapidly with minimal manual processing. A complete
set of forward and reverse primers was selected initially for each
ORF on chromosomes I, II, III, V, VI, VIII, IX, X, and XI.
Primers for a representative set of ORFs (15% coverage) were 
chosen for the remaining chromosomes. With the release of the 
entire yeast genome sequence, the complete set of primers has 
now been selected. 

Because each ORF requires a unique pair of synthetic primers, 
a total of approximately 12,200 oligonucleotides will be required 
to individually amplify each target. This costly component was 
addressed with the automated multiplex oligonucleotide synthe- 
sizer (6) which efficiently synthesizes primers in a 96-well format. 
Each primer, synthesized on a 20-nmol scale, provides enough 
material for 100 amplification reactions, whereas a given PCR 
product provides enough material to generate an element on 

Table 1. Heat shock vs. control expression data

Ratio of gene expression
Control   Heat    ORF      Gene     Description
          2.2     YLR142   PUT1     Proline oxidase
          2.0     YOL140   ARG8     Acetylornithine aminotransferase
2.3               YGL148   ARO2     Chorismate synthase
          36.0    YFL014   HSP12    Heat shock protein
          27.4    YBR072   HSP26    Heat shock protein
          6.7     YBR054   YRO2     Similarity to HSP30 heat shock protein Yro1p
          3.4     YCR021   HSP30    Heat shock protein
          2.6     YER103   SSA4     Heat shock protein
          2.5     YLR259   HSP60    Mitochondrial heat shock protein HSP60
          2.1     YBR169   SSE2     Heat shock protein of the HSP70 family
          1.7     YBL075   SSA3     Cytoplasmic heat shock protein
          1.4     YPL240   HSP82    Heat shock protein
          1.4     YDR258   HSP78    Mitochondrial heat shock protein of clpb family of ATP-dependent proteases
1.0               YNL007   SIS1     Heat shock protein
1.1               YEL030            70-kDa heat shock protein
1.9               YHR064            Heat shock protein
          1.3     YBL008   HIR1     Histone transcription regulator
2.6               YBL002   HTB2     Histone H2B.2
3.3               YBL003   HTA2     Histone H2A.2
3.3               YBR010   HHT1     Histone H3
3.9               YBR009   HHF1     Histone H4
          2.4     YDR343   HXT6     High-affinity hexose transporter
          2.1     YHR092   HXT4     Moderate- to low-affinity glucose transporter
3.6               YAR071   PHO11    Secreted acid phosphatase, 56 kDa isozyme
          2.3     YLR096   KIN2     Ser/Thr protein kinase
2.5               YER102   RPS8B    Ribosomal protein S8.e
2.6               YBR181   RPS101   Ribosomal protein S6.e
2.6               YCR031   CRY1     40S ribosomal protein S14.e
2.7               YLR441   RP10A    Ribosomal protein S3.a.e
2.8               YHR141   RPL41B   Ribosomal protein L36a.e
2.8               YBL072   RPS8A    Ribosomal protein S8.e
2.8               YHL015   URP2     Ribosomal protein
2.8               YBR191   URP1     Ribosomal protein L21.e
3.1               YLR340   RPLA0    Acidic ribosomal protein L10.e
3.3               YGL123   SUP44    Ribosomal protein
          5.8     YLR194            Hypothetical protein

500-1,000 arrays. Thus, a single primer pair provides enough 
starting material for up to ≈50,000 arrays.
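
The arithmetic behind these figures can be checked with a short sketch. The function name and parameters are hypothetical. The value of 500 arrays per PCR product is the lower end of the 500-1,000 range quoted in the text, and it is the value that reproduces the ≈50,000 figure.

```python
def primer_budget(n_orfs, reactions_per_synthesis, arrays_per_product):
    """Return (oligonucleotides needed, arrays obtainable per primer pair).

    n_orfs: number of ORFs to amplify (one forward and one reverse primer each).
    reactions_per_synthesis: PCR reactions supported by one 20-nmol synthesis.
    arrays_per_product: arrays that can be printed from one PCR product.
    """
    oligos_needed = 2 * n_orfs
    arrays_per_primer_pair = reactions_per_synthesis * arrays_per_product
    return oligos_needed, arrays_per_primer_pair


# Figures quoted in the text: ~6,100 ORFs, 100 reactions, 500 arrays per product.
oligos, arrays = primer_budget(6100, 100, 500)
```

With the upper end of the range (1,000 arrays per product) the same calculation gives 100,000 arrays per primer pair.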

Primers were synthesized to amplify yeast ORFs. Primer 
synthesis had a failure rate of <1% in over 18 plates of 
synthesis as determined by standard trityl analysis (6). The 
success rate of the PCR amplifications using the primer pairs 
was 94% based on agarose gel analysis of each PCR. The 
purified PCR products were used to generate arrays. Two 
versions of the arrays were created for the experimental results 
presented here. The first array contained 2,287 elements and 
the second array batch contained 2,479 elements. 

Genome Arrays. The amplified ORFs were arrayed onto glass 
at a spacing of 345 microns (Fig. 1). The high-density spacing of 
DNA samples allows the hybridization volumes to be minimized;
volumes are a maximum of 10 µl. The labeled probe can
thus be maintained at relatively high concentrations, making 1-2
µg of mRNA sufficient for analysis. This also obviates the need
for a subsequent amplification step and thus avoids the risk of 
altering the relative ratios of different cDNA species in the 
sample. 

Genetic Analysis: Genomic Comparison of Unrelated Strains. 
Microarrays allow efficient comparison of the genomes of dif- 
ferent strains. Genomic DNA from Y55, an S. cerevisiae strain 
divergent from the reference strain S288c, was randomly labeled 
with Cy3-dUTP and hybridized simultaneously with the S288c 
DNA labeled with Cy5-dUTP. When a comparison between the 
hybridization of the DNA from the two strains was done, several 



elements gave relatively little or no signal above background from 
the Cy3 channel (data not shown). These include SGE1, 
ASP3A-D, YLR156, YLR159, YLR161, ENA2 (YDR039 is 
ENA2), and YCR105. These results imply that the regions 
containing these genes are extremely divergent, or altogether
deleted from the strain. Subsequent attempts to generate PCR 
products from SGE1, ENA2, and ASP3A using Y55 DNA failed. 
This result supports the conclusion that these genes are likely to 
be missing from the Y55 genome. It is interesting to note that at 
least two of the regions absent in the Y55 genome have been 
previously shown or suggested to be deleted in mutant laboratory 
strains (14-16). In particular, the Asp-3 region appears to be 
highly prone to being deleted (15, 16). 

These results indicate that gene arrays can be used to efficiently 
screen different strains of an organism for large deletion poly- 
morphisms. A single hybridization and scan will reveal differences 
based on differential hybridization to particular elements. It is 
reasonable to suppose that an equivalent number of genes are
present in the Y55 genome and absent in the S288c genome. This
result should be viewed as a minimum estimate of the deletion
polymorphisms that exist between these two unrelated strains as
intergenic deletions or small intragenic deletions would not be
detected because considerable hybridizing material would
remain. Sequence polymorphisms, such as deletions, are present
in populations of every species and must at some level affect
phenotype. One of the challenges of the genome era will be to
critically examine sequence polymorphisms that exist in the
natural gene pool relative to the reference genome sequence.
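
A deletion screen of the kind described above can be sketched as a simple filter. The sketch assumes that each element has a background-subtractable signal in the test-strain channel (Cy3, Y55) and in the reference channel (Cy5, S288c), and that "little or no signal above background" is taken to mean less than background plus two standard deviations. That cutoff is borrowed from the reliability criterion in the Methods and is an assumption here; the function name and the intensity values are hypothetical.

```python
def flag_candidate_deletions(signals, background_mean, background_sd, n_sd=2.0):
    """List elements detected in the reference strain but not in the test strain.

    signals: dict mapping element name -> (test_signal, reference_signal).
    An element is flagged when the reference channel is at or above the
    cutoff (background_mean + n_sd * background_sd) and the test channel
    is below it.
    """
    cutoff = background_mean + n_sd * background_sd
    return sorted(name for name, (test, reference) in signals.items()
                  if reference >= cutoff and test < cutoff)


# Hypothetical intensities for three elements (arbitrary units).
example_signals = {
    "SGE1": (52.0, 900.0),     # absent from the test-strain channel
    "ENA2": (48.0, 1200.0),    # absent from the test-strain channel
    "YFL014": (850.0, 880.0),  # present in both strains
}
candidates = flag_candidate_deletions(example_signals,
                                      background_mean=50.0,
                                      background_sd=5.0)
```

Flagged elements are only candidates; as in the text, an independent check such as a failed PCR from the test strain's DNA would be needed to support a deletion.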

Gene Expression Analysis. The arrays were used to examine
gene expression in yeast grown under a variety of different
conditions. Expression analysis is an ideal application of these
arrays because a single hybridization provides quantitative expression
data for thousands of genes. To better understand results for
genes of known function, ORFs were placed in biologically relevant
categories on the basis of function (e.g., amino acid catabolic
genes) and/or pathways (e.g., the histidine biosynthesis pathway).

Fig. 2. ORF categories displaying differential expression between
heat shocked and untreated cultures. Bars within categories
correspond to individual ORFs. Green shaded bars correspond to
relative increases in ORF expression under 25°C growth conditions.
Red shaded bars correspond to relative increases in ORF expression
under 39°C growth conditions.

[Fig. 2 image: panel title "Heat Shock"; bars are grouped under ORF category labels such as amino acid catabolism, amino acid synthesis, glucose metabolism, histone, ribosomal protein, and heat shock protein.]

Table 2. Cold shock vs. control expression data

Ratio of gene expression
Control   Cold    ORF      Gene     Description
          3.3     YOR153   PDR5     Pleiotropic drug resistance protein
2.4               YCR012   PGK1     Phosphoglycerate kinase
2.9               YCL040   GLK1     Aldohexose specific glucokinase
          1.4     YHR064            Heat shock protein
2.0               YJL034   KAR2     Nuclear fusion protein
2.1               YDR258   HSP78    Mitochondrial heat shock protein of clpb family of ATP-dependent proteases
2.2               YLL039   UBI4     Ubiquitin precursor
2.7               YLL026   HSP104   Heat shock protein
3.1               YER103   SSA4     Heat shock protein
3.3               YBR126   TPS1     α,α-Trehalose-phosphate synthase (UDP-forming)
3.8               YPL240   HSP82    Heat shock protein
7.9               YBR054   YRO2     Similarity to HSP30 heat shock protein Yro1p
7.9               YBR072   HSP26    Heat shock protein
16.5              YCR021   HSP30    Heat shock protein
1.8               YDR343   HXT6     High-affinity hexose transporter
2.1               YHR096   HXT5     Putative hexose transporter
2.4               YFR053   HXK1     Hexokinase I
2.8               YHR092   HXT4     Moderate- to low-affinity glucose transporter
3.4               YHR094   HXT1     Low-affinity hexose (glucose) transporter
          2.3     YHR089   GAR1     Nucleolar rRNA processing protein
          1.7     YLR048   NAB1B    40S ribosomal protein p40 homolog b
          1.7     YLR441   RP10A    Ribosomal protein S3a.e
          1.7     YLL045   RPL4B    Ribosomal protein L7a.e.B
          1.6     YLR029   RPL13A   Ribosomal protein L15.e
          1.6     YGL123   SUP44    Ribosomal protein
          3.1     YBR067   TIP1     Cold- and heat-shock-induced protein of the Srp1/Tip1p family
          2.2     YER011   TIR1     Cold-shock-induced protein of the Tir1p, Tip1p family
          2.0     YCR058            Hypothetical protein
          4.2     YKL102            Hypothetical protein

Table 3. Glucose vs. galactose expression data

Ratio of gene expression
Glucose   Galactose   ORF      Gene       Description
2.1                   YHR018   ARG4       Arginosuccinate lyase
3.5                   YPR035   GLN1       Glutamate-ammonia ligase
2.8                   YML116   ATR1       Aminotriazole and 4-nitroquinoline resistance protein
2.0                   YMR303   ADH2       Alcohol dehydrogenase II
3.7                   YBR145   ADH5       Alcohol dehydrogenase V
          3.2         YBL030   AAC2       ADP, ATP carrier protein 2
          2.9         YBR085   AAC3       ADP, ATP carrier protein
          2.7         YDR298   ATP5       H+-transporting ATP synthase δ chain precursor
          2.5         YBR039   ATP3       H+-transporting ATP synthase γ chain precursor
          5.5         YML054   CYB2       Lactate dehydrogenase cytochrome b2
          3.4         YML054   CYB2       Lactate dehydrogenase cytochrome b2
          2.3         YKL150   MCR1       Cytochrome-b5 reductase
          4.2         YBL045   COR1       Ubiquinol-cytochrome c reductase 44K core protein
          3.5         YDL067   COX9       Cytochrome c oxidase chain VIIA
          2.7         YLR038   COX12      Cytochrome c oxidase, subunit VIB
          2.6         YHR051   COX6       Cytochrome c oxidase subunit VI
          2.4         YLR395   COX8       Cytochrome c oxidase chain VIII
          2.3         YFR033   QCR6       Ubiquinol-cytochrome c reductase 17K protein
          23.7        YLR081   GAL2       Galactose (and glucose) permease
          21.9        YBR018   GAL7       UDP-glucose-hexose-1-phosphate uridylyltransferase
          21.8        YBR020   GAL1       Galactokinase
          19.5        YBR019   GAL10      UDP-glucose 4-epimerase
          14.7        YLR081   GAL2       Galactose (and glucose) permease
          8.6         YDR009   GAL3       Galactokinase
          3.0         YML051   GAL80(1)   Negative regulator for expression of galactose-induced genes
          2.8         YML051   GAL80(2)   Negative regulator for expression of galactose-induced genes
2.7                   YER055   HIS1       ATP phosphoribosyl transferase
3.4                   YBR248   HIS7       Glutamine amidotransferase/cyclase
7.4                   YCL030   HIS4       Phosphoribosyl-AMP cyclohydrolase/phosphoribosyl-ATP pyrophosphatase/histidinol dehydrogenase
5.8                   YKR080   MTD1       Methylenetetrahydrofolate dehydrogenase (NAD+)
6.0                   YDR019   GCV1       Glycine decarboxylase T subunit
6.1                   YLR058   SHM2       Serine hydroxymethyltransferase
          8.1         YML123   PHO84      High-affinity inorganic phosphate/H+ symporter
3.5                   YDR408   ADE8       Phosphoribosylglycinamide formyltransferase (GART)
3.6                   YDR408   ADE8       Phosphoribosylglycinamide formyltransferase (GART)
4.4                   YAR015   ADE1       Phosphoribosylamidoimidazole-succinocarboxamide synthase
5.6                   YMR300   ADE4       Amidophosphoribosyltransferase
5.6                   YOR128   ADE2       Phosphoribosylaminoimidazole carboxylase
6.0                   YGL234   ADE5,7     Phosphoribosylamine-glycine ligase and phosphoribosylformylglycinamidine cyclo-ligase
          6.3         YBL015   ACH1       Acetyl-CoA hydrolase


Heat Shock Results. A log phase culture growing in YEP/ 
dextrose medium at 25°C was split in half. One half of the 
culture remained at 25°C whereas the other half of the culture 
was shifted to 39°C. mRNA was isolated from both cultures 1 h 
after heat shock for comparison on microarrays and, although 
this time point is not optimal for measuring induction of heat 
shock mRNAs (17), many known heat shock genes exhibited 
considerable induction at this time point (Table 1; Fig. 2). 
Down-regulation of genes in the ribosomal protein and histone 
gene categories was also observed. Differential expression 
between the heat-shocked culture and the control was also 
observed for many other genes. Genes in many categories, such 
as amino acid catabolism and amino acid synthesis, exhibited 
a mixed response with some genes showing little or no 
differential expression and other genes showing a significant 
increase or decrease in gene expression in response to heat 
shock (Table 1; Fig. 2). 
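
In the expression tables, each ratio appears under one of the two condition columns. One plausible reading, assumed here rather than stated in the text, is that the ratio is always expressed as a value of at least 1 and placed under the condition with the higher expression. A minimal sketch of that convention, with a hypothetical function name and hypothetical intensities:

```python
def oriented_ratio(intensity_a, intensity_b, labels=("Control", "Heat")):
    """Return (label of the condition with higher expression, ratio >= 1).

    intensity_a, intensity_b: normalized signals for the two conditions,
    in the same order as `labels`. Both must be positive.
    """
    if intensity_a <= 0 or intensity_b <= 0:
        raise ValueError("intensities must be positive")
    if intensity_b >= intensity_a:
        return labels[1], intensity_b / intensity_a
    return labels[0], intensity_a / intensity_b


# Hypothetical signals chosen to give ratios of 36.0 and 3.6.
induced = oriented_ratio(100.0, 3600.0)
repressed = oriented_ratio(360.0, 100.0)
```

Under this reading, a gene induced by heat shock is listed in the Heat column and a gene repressed by heat shock is listed in the Control column.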

Cold Shock Results. A log phase culture growing in YEP/ 
dextrose medium at 37°C was split in half. One half of the 
culture remained at 37°C while the other half of the culture was 
shifted to 18°C. mRNA was isolated from both cultures 1 h 
after cold shock for comparison on microarrays. As expected,
two known cold shock genes (TIP1, TIR1) were expressed at
a significantly higher level in the cold-shocked culture. Genes
in other functional categories, such as glucose metabolism and
heat shock, displayed a mixed response with expression of some
genes being unaffected and other genes exhibiting significant
up- or down-regulation in response to cold shock (Table 2).

Steady-State Galactose vs. Glucose Results. mRNA was 
isolated from steady-state log phase YEP galactose and YEP 
glucose grown cultures for comparison on the microarrays. As 
expected, the GAL genes were expressed at a much higher 
level in the galactose culture. Many genes were differentially 
expressed in these cultures that were not a priori expected to 
exhibit differential expression. For example, some genes in the 
amino acid catabolic category were up-regulated in the galac- 
tose culture whereas genes in the one-carbon metabolism and 
purine categories were largely or entirely down-regulated in 
the galactose culture (Table 3). Genes in other categories, such 
as amino acid synthesis, abc transporter, cytochrome c, and 
cytochrome b, exhibited mixed responses; some genes in a 
category showed little or no obvious differential expression 
whereas other genes in the same category showed significant 
differential expression in the galactose and glucose cultures. 

DISCUSSION

The results of these experiments show that many genes are 
differentially expressed under the three environmental condi- 
tions described here. The expected and predicted changes in gene 
expression, such as HSP12 in the heat-shocked culture, TIP1 in 
the cold-shocked culture, and GAL2 in the steady-state galactose 
culture, were observed in every case. However, in addition to the 
expected changes in gene expression, significant differential 
expression was also observed for many other genes that would 
not, a priori, be expected to be differentially expressed. For 
example, expression of PHO11 decreased and expression of
YLR194, KIN2, and HXT6 increased in the heat shocked culture. 
Expression of MST1 and APE3 decreased and expression of 
PDR5 and GAR1 increased in the cold-shocked culture. In 
addition, ADE4 and SER2 were expressed at reduced levels 
whereas PHO84 and ACH1 were expressed at higher levels in
cells grown in galactose compared with cells grown in glucose. 
Differential expression of these and many other genes was specific 
to one of these three environmental conditions. 

Many other genes were found to be differentially expressed 
under more than one condition. When differentially expressed 
genes in cold- and heat-shocked cultures were compared, 30 
genes were found in common. Of these 30 genes, 28 showed 
inverse expression (i.e., increased expression under one condition 
and decreased expression under the other condition). Two genes, 
YCR058 and YKL102, showed elevated expression in response to 
both cold and heat shock. Fifteen genes were found to be 
differentially expressed in both the heat-shocked and steady-state 
galactose cultures: 9 genes showed increased expression and 5 
showed decreased expression under both conditions. Twenty 
genes were differentially expressed in both the cold-shocked and 
steady-state galactose cultures: 8 genes showed decreased expres- 
sion and 5 genes showed increased expression under both con- 
ditions. Six genes showed increased expression in the galactose 
culture and decreased expression in the cold shocked culture. 
One gene (ODP1) showed increased expression in both the 
cold-shocked and steady-state galactose cultures. 
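
The comparison between conditions described above can be illustrated with a short Python sketch. It is only an assumption about how such a tally might be made, not the procedure used in this study: each condition is represented as a hypothetical mapping from gene name to +1 (increased) or -1 (decreased), and the gene names in the toy input are placeholders rather than data from the tables.

```python
def classify_overlap(cond_a, cond_b):
    """Split genes that changed under both conditions by direction of change.

    cond_a, cond_b: dict mapping gene name -> +1 (increased) or -1 (decreased).
    Genes absent from either dict are treated as not differentially expressed.
    """
    shared = set(cond_a) & set(cond_b)
    both_up = sorted(g for g in shared if cond_a[g] > 0 and cond_b[g] > 0)
    both_down = sorted(g for g in shared if cond_a[g] < 0 and cond_b[g] < 0)
    inverse = sorted(g for g in shared if cond_a[g] * cond_b[g] < 0)
    return {"both_up": both_up, "both_down": both_down, "inverse": inverse}


# Toy input with placeholder gene names (hypothetical, not from the paper).
heat = {"geneA": +1, "geneB": -1, "geneC": +1, "geneD": -1}
cold = {"geneA": +1, "geneB": +1, "geneC": -1, "geneE": -1}

overlap = classify_overlap(heat, cold)
print(overlap)
```

In this toy case three genes are shared, one changes in the same direction under both conditions, and two show inverse expression.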

Gene expression is affected in a global fashion when environ- 
mental conditions are changed and both expected and unex- 
pected genes are affected. There is also overlap in the genes that 
are differentially expressed under quite different environmental 
conditions. These results can be rationalized by considering the 
high degree of cross-pathway regulation in yeast. For example, 
there is evidence for cross-pathway regulation between (i) carbon
and nitrogen metabolism (18), (ii) phosphate and sulfate metabolism
(19), and (iii) purine, phosphate, and amino acid metabolism
(20-24). There are also examples of the interaction of
general and specific transcription factors (25, 26). Finally, within 
the broad class of amino acid biosynthetic genes, there is evidence 
for amino acid specific regulation of some genes, regulation via 
general control for other genes, and regulation via both specific 
and general control for other genes (22, 27-30). 

Cross-pathway regulation arises from the complex structure 
of promoters. Virtually all promoters contain sites for multiple 
transcription factors and, therefore, virtually all genes are 
subject to combinatorial regulation. For example, the HIS4 
promoter contains binding sites for GCN4 (the general amino 
acid control transcription factor), PHO2/BAS2 (a transcriptional
regulator of phosphatase and purine biosynthetic
genes), and BAS1 (a transcriptional regulator of purine
biosynthetic genes) (31). It is likely that the complex effects on
gene expression described in this work are a direct conse- 
quence of the combinatorial regulation of gene expression. 

These findings illustrate the power of the highly parallel whole 
genome approach when examining gene expression. The global 
effects of environmental change on gene expression can now be 
directly visualized. It is clear that determining the mechanism(s) 
and the functional role of the dramatic global effects on gene 
expression in different environments will be a significant challenge.
The era of whole genome analysis will, ultimately, allow
researchers to switch from the very focused single gene/promoter 
view of gene expression and instead view the cell more as a large 
complex network of gene regulatory pathways. 

With the entire sequence of this model organism known, new 
approaches have been developed that allow for genome wide 
analyses (32, 33) of gene function. The genome microarrays 
represent a novel tool for genetic and expression analysis of the 
yeast genome. This pilot study uses arrays containing >35% of 
the yeast ORFs and it is clear that the entire set of ORFs from 
the yeast genome can be arrayed using the directed primer based 
strategy detailed here. Recent advances in arraying technology 
will allow all 6,100 ORFs to be arrayed in an area of less than 1.8
cm². Furthermore, as the technology improves, detection limits
will allow less than 500 ng of starting mRNA material to be used 
for making probe. 

The genome arrays provide for a robust, fully automated 
approach toward examining genome structure and gene func- 
tion. They allow for comparisons between different genomes 
as well as a detailed study of gene expression at the global level. 
This research will help to elucidate relationships between 
genes and allow the researcher to understand gene function by 
understanding expression patterns across the yeast genome. 

Support was provided by National Institutes of Health Grant 

1. Fleischmann, R. D., Adams, M. D., White, O., Clayton, R. A., Kirkness, E. F., et
al. (1995) Science 269, 496-512.

2. Fraser, C. M., Gocayne, J. D., White, O., Adams, M. D., Clayton, R. A., et al.
(1995) Science 270, 397-403.

3. Bult, C. J., White, O., Olsen, G. J., Zhou, L., Fleischmann, R. D., et al. (1996)
Science 273, 1058-1073.

4. Sulston, J., Du, Z., Thomas, K., Wilson, R., Hillier, L., et al. (1992) Nature
(London) 356, 37.

5. Newman, T., de Bruijn, F. J., Green, P., Keegstra, K., Kende, H., et al. (1994) Plant
Physiol. 106, 1241-1255.

6. Lashkari, D. A., Hunicke-Smith, S. P., Norgren, R. M., Davis, R. W. & Brennan,
T. (1995) Proc. Natl. Acad. Sci. USA 92, 7912-7915.

7. Schena, M., Shalon, D., Davis, R. W. & Brown, P. O. (1995) Science 270, 467-470.

8. Shalon, D., Smith, S. & Brown, P. O. (1996) Genome Res. 6, 639-645.

9. Heller, R. A., Schena, M., Chai, A., Shalon, D., Bedilion, T., Gilmore, J., Woolley,
D. E. & Davis, R. W. (1997) Proc. Natl. Acad. Sci. USA 94, 2150-2155.

10. DeRisi, J., Penland, L., Brown, P. O., Bittner, M. L., Meltzer, P. S., Ray, M., Chen,
Y., Su, Y. A. & Trent, J. M. (1996) Nat. Genet. 14, 457-460.

11. Nelson, S. F., McCusker, J. H., Sander, M. A., Kee, Y., Modrich, P. & Brown, P. O.
(1993) Nat. Genet. 4, 11-18.

12. Hoffman, C. S. & Winston, F. (1989) Gene 84, 473-479.

13. Schmitt, M., Brown, T. & Trumpower, B. (1990) Nucleic Acids Res. 18, 3091.

14. Ehrenhofer-Murray, A. E., Wurgler, F. E. & Sengstag, C. (1994) Mol. Gen. Genet.
244, 287-294.

15. Kim, K.-W., Kamerud, J. Q., Livingston, D. M. & Roon, R. J. (1988) J. Biol. Chem.
263, 11948-11953.

16. Kim, K.-W. & Roon, R. J. (1984) J. Bacteriol. 157, 958-961.

17. Craig, E. A. (1992) in The Molecular and Cellular Biology of the Yeast
Saccharomyces: Gene Expression, eds. Jones, E. W., Pringle, J. R. & Broach, J. R.
(Cold Spring Harbor Lab. Press, Plainview, NY), Vol. 2, pp. 501-537.

18. Dang, V. D., Bohn, C., Bolotin-Fukuhara, M. & Daignan-Fornier, B. (1996) J.
Bacteriol. 178, 1842-1849.

19. O'Connell, K. F. & Baker, R. E. (1992) Genetics 132, 63-73.

20. Braus, G., Mosch, H. U., Vogel, K., Hinnen, A. & Hutter, R. (1989) EMBO J. 8,
939-945.

21. Mosch, H. U., Scheier, B., Lahti, R., Mantsala, P. & Braus, G. H. (1991) J. Biol.
Chem. 266, 20453-20456.

22. Mitchell, A. P. & Magasanik, B. (1984) Mol. Cell. Biol. 4, 2767-2773.

23. Daignan-Fornier, B. & Fink, G. R. (1992) Proc. Natl. Acad. Sci. USA 89,
6746-6750.

24. Tice-Baldwin, K., Fink, G. R. & Arndt, K. T. (1989) Science 246, 931-935.

25. Messenguy, F. & Dubois, E. (1993) Mol. Cell. Biol. 13, 2586-2592.

26. Devlin, C., Tice-Baldwin, K., Shore, D. & Arndt, K. T. (1991) Mol. Cell. Biol. 11,
3642-3651.

27. Magasanik, B. (1992) in The Molecular and Cellular Biology of the Yeast
Saccharomyces: Gene Expression, eds. Jones, E. W., Pringle, J. R. & Broach, J. R.
(Cold Spring Harbor Lab. Press, Plainview, NY), Vol. 2, pp. 283-317.

28. Hinnebusch, A. G. (1992) in The Molecular and Cellular Biology of the Yeast
Saccharomyces: Gene Expression, eds. Jones, E. W., Pringle, J. R. & Broach, J. R.
(Cold Spring Harbor Lab. Press, Plainview, NY), Vol. 2, pp. 319-414.

29. Brisco, P. R. & Kohlhaw, G. B. (1990) J. Biol. Chem. 265, 11667-11675.

30. O'Connell, K. F., Surdin-Kerjan, Y. & Baker, R. E. (1995) Mol. Cell. Biol. 15,
1879-1888.

31. Arndt, K. T., Styles, C. & Fink, G. R. (1987) Science 237, 874-880.

32. Smith, V., Chou, K. N., Lashkari, D., Botstein, D. & Brown, P. O. (1996) Science
274, 2069-2074.

33. Shoemaker, D. D., Lashkari, D. A., Morris, D., Mittman, M. & Davis, R. W.
(1996) Nat. Genet. 14, 450-456.



Exhibit G of Rockett Declaration 
with Response dated 03/18/04 
In USSN: 10/031,904



Fischer-Vize, Science 270, 1828 (1995).

35. T. C. James and S. C. Elgin, Mol. Cell. Biol. 6, 3862
(1986); R. Paro and D. S. Hogness, Proc. Natl. Acad.
Sci. U.S.A. 88, 263 (1991); B. Tschiersch et al.,
EMBO J. 13, 3822 (1994); M. T. Madireddi et al., Cell
87, 75 (1996); D. G. Stokes, K. D. Tartof, R. P. Perry,
Proc. Natl. Acad. Sci. U.S.A. 93, 7137 (1996).

36. P. M. Palosaari et al., J. Biol. Chem. 266, 10750
(1991); A. Schmitz, K. H. Gartemann, J. Fiedler, E.
Grund, R. Eichenlaub, Appl. Environ. Microbiol. 58,
4068 (1992); V. Sharma, K. Suvarna, R. Meganathan,
M. E. Hudspeth, J. Bacteriol. 174, 5057
(1992); M. Kanazawa et al., Enzyme Protein 47, 9
(1993); Z. L. Boynton, G. N. Bennett, F. B. Rudolph,
J. Bacteriol. 178, 3015 (1996).

37. M. Ho et al., Cell 77, 869 (1994).

38. W. Hendriks et al., J. Cell. Biochem. 59, 418 (1995).

39. We thank H. Skaletsky and F. Lewitter for help with
sequence analysis; Lawrence Livermore National
Laboratory for the flow-sorted Y cosmid library; and
P. Bain, A. Bortvin, A. de la Chapelle, G. Fink, K.
Jegalian, T. Kawaguchi, E. Lander, H. Lodish, P.
Matsudaira, D. Menke, U. RajBhandary, R. Reijo, S.
Rozen, A. Schwartz, C. Sun, and C. Titford for comments
on the manuscript. Supported by NIH

28 April 1997; accepted 9 September 1997



Exploring the Metabolic and Genetic Control of 
Gene Expression on a Genomic Scale 

Joseph L. DeRisi, Vishwanath R. Iyer, Patrick O. Brown*

DNA microarrays containing virtually every gene of Saccharomyces cerevisiae were used 
to carry out a comprehensive investigation of the temporal program of gene expression 
accompanying the metabolic shift from fermentation to respiration. The expression 
profiles observed for genes with known metabolic functions pointed to features of the 
metabolic reprogramming that occur during the diauxic shift, and the expression patterns 
of many previously uncharacterized genes provided clues to their possible functions. The 
same DNA microarrays were also used to identify genes whose expression was affected 
by deletion of the transcriptional co-repressor TUP1 or overexpression of the transcrip- 
tional activator YAP1. These results demonstrate the feasibility and utility of this ap- 
proach to genomewide exploration of gene expression patterns. 



The complete sequences of nearly a dozen
microbial genomes are known, and in the 
next several years we expect to know the 
complete genome sequences of several 
metazoans, including the human genome. 
Defining the role of each gene in these 
genomes will be a formidable task, and un- 
derstanding how the genome functions as a 
whole in the complex natural history of a 
living organism presents an even greater 
challenge. 

Knowing when and where a gene is 
expressed often provides a strong clue as to 
its biological role. Conversely, the pattern 
of genes expressed in a cell can provide 
detailed information about its state. Al- 
though regulation of protein abundance in 
a cell is by no means accomplished solely 
by regulation of mRNA, virtually all dif- 
ferences in cell type or state are correlated 
with changes in the mRNA levels of many 
genes. This is fortuitous because the only 
specific reagent required to measure the 
abundance of the mRNA for a specific 
gene is a cDNA sequence. DNA microar- 
rays, consisting of thousands of individual 
gene sequences printed in a high-density 
array on a glass microscope slide (1, 2),
provide a practical and economical tool 
for studying gene expression on a very 
large scale (3-6). 

Saccharomyces cerevisiae is an especially 
favorable organism in which to conduct a
systematic investigation of gene expression. 
The genes are easy to recognize in the ge- 
nome sequence, cis regulatory elements are 
generally compact and close to the tran- 
scription units, much is already known 
about its genetic regulatory mechanisms, 
and a powerful set of tools is available for its 
analysis. 

A recurring cycle in the natural history 
of yeast involves a shift from anaerobic 
(fermentation) to aerobic (respiration) me- 
tabolism. Inoculation of yeast into a medi- 
um rich in sugar is followed by rapid growth 
fueled by fermentation, with the production 
of ethanol. When the fermentable sugar is 
exhausted, the yeast cells turn to ethanol as 
a carbon source for aerobic growth. This 
switch from anaerobic growth to aerobic 
respiration upon depletion of glucose, re- 
ferred to as the diauxic shift, is correlated 
with widespread changes in the expression 
of genes involved in fundamental cellular 
processes such as carbon metabolism, pro- 
tein synthesis, and carbohydrate storage 
(7). We used DNA microarrays to charac- 
terize the changes in gene expression that 
take place during this process for nearly the 
entire genome, and to investigate the ge- 
netic circuitry that regulates and executes 
this program. 

Yeast open reading frames (ORFs) were 
amplified by the polymerase chain reaction 
(PCR), with a commercially available set of 
primer pairs (8). DNA microarrays, con- 
taining approximately 6400 distinct DNA 
sequences, were printed onto glass slides by
using a simple robotic printing device (9).
Cells from an exponentially growing culture 
of yeast were inoculated into fresh medium 
and grown at 30°C for 21 hours. After an 
initial 9 hours of growth, samples were har- 
vested at seven successive 2-hour intervals, 
and mRNA was isolated (10). Fluorescently 
labeled cDNA was prepared by reverse transcription
in the presence of Cy3(green)- or Cy5(red)-labeled
deoxyuridine triphosphate (dUTP) (11) and then
hybridized to the microarrays (12). To maximize the
reliability with which changes in expression
levels could be discerned, we labeled cDNA
prepared from cells at each successive time 
point with Cy5, then mixed it with a Cy3- 
labeled "reference" cDNA sample prepared 
from cells harvested at the first interval 
after inoculation. In this experimental de- 
sign, the relative fluorescence intensity 
measured for the Cy3 and Cy5 fluors at 
each array element provides a reliable mea- 
sure of the relative abundance of the corre- 
sponding mRNA in the two cell popula- 
tions (Fig. 1). Data from the series of seven 
samples (Fig. 2), consisting of more than 
43,000 expression-ratio measurements,
were organized into a database to facilitate 
efficient exploration and analysis of the 
results. This database is publicly available 
on the Internet (13). 
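
As a rough illustration of the two-color measurement described above, the following Python sketch converts paired Cy5 and Cy3 intensities into log2 expression ratios. The function name, the intensity floor, and the spot values are assumptions made for the example; the scanning and normalization steps actually used are described in the cited notes, not here.

```python
import math


def log2_ratio(cy5, cy3, floor=1.0):
    """Return log2(Cy5 / Cy3) for one array element.

    cy5: intensity of the time-point sample (red channel).
    cy3: intensity of the reference sample (green channel).
    floor: hypothetical lower bound that keeps background-subtracted
           intensities of zero or below from breaking the ratio.
    """
    return math.log2(max(cy5, floor) / max(cy3, floor))


# Hypothetical intensities for three array elements: (Cy5, Cy3).
spots = {
    "spot1": (4000.0, 1000.0),  # higher in the time-point sample
    "spot2": (500.0, 2000.0),   # higher in the reference sample
    "spot3": (1500.0, 1500.0),  # unchanged
}

ratios = {name: log2_ratio(cy5, cy3) for name, (cy5, cy3) in spots.items()}
print(ratios)
```

Working in log2 units makes a fourfold induction (+2) and a fourfold repression (-2) symmetric around zero, which is one common reason for the transformation.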

During exponential growth in glucose- 
rich medium, the global pattern of gene 
expression was remarkably stable. Indeed, 
when gene expression patterns between the 
first two cell samples (harvested at a 2-hour 
interval) were compared, mRNA levels dif- 
fered by a factor of 2 or more for only 19 
genes (0.3%), and the largest of these dif- 
ferences was only 2.7-fold (14). However, as 
glucose was progressively depleted from the 
growth media during the course of the ex- 
periment, a marked change was seen in the 
global pattern of gene expression. mRNA 
levels for approximately 710 genes were 
induced by a factor of at least 2, and the 
mRNA levels for approximately 1030 genes 
declined by a factor of at least 2. Messenger 
RNA levels for 183 genes increased by a 
factor of at least 4, and mRNA levels for 
203 genes diminished by a factor of at least 
4. About half of these differentially expressed
genes have no currently recognized
function and are not yet named. Indeed,
more than 400 of the differentially expressed
genes have no apparent homology
to any gene whose function is known (15).
The responses of these previously uncharacterized
genes to the diauxic shift therefore
provide the first small clue to their possible
roles. 
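
The tallies above can be pictured with a small Python sketch. It assumes, purely for illustration, that a gene counts as induced (or repressed) when its expression ratio relative to the reference reaches the threshold at any time point; the text does not state the exact rule used, and the profiles below are invented placeholders.

```python
def count_changes(profiles, factor):
    """Count genes induced or repressed by at least `factor`.

    profiles: dict mapping gene name -> list of expression ratios
              (sample / reference), one per time point.
    A gene is counted as induced if any ratio is >= factor and as
    repressed if any ratio is <= 1 / factor (an assumed criterion).
    """
    induced = sum(1 for series in profiles.values() if max(series) >= factor)
    repressed = sum(1 for series in profiles.values() if min(series) <= 1.0 / factor)
    return induced, repressed


# Hypothetical time-course ratios for five placeholder genes.
profiles = {
    "g1": [1.0, 1.5, 4.5],
    "g2": [1.0, 2.0, 1.8],
    "g3": [1.0, 0.9, 1.1],
    "g4": [1.0, 0.5, 0.6],
    "g5": [1.0, 0.4, 0.2],
}

twofold = count_changes(profiles, 2)
fourfold = count_changes(profiles, 4)
print(twofold, fourfold)
```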

The global view of changes in expres- 
sion of genes with known functions pro- 
vides a vivid picture of the way in which 
the cell adapts to a changing environ- 
ment. Figure 3 shows a portion of the yeast 
metabolic pathways involved in carbon 
and energy metabolism. Mapping the 
changes we observed in the mRNAs en- 
coding each enzyme onto this framework 
allowed us to infer the redirection in the 
flow of metabolites through this system. 
We observed large inductions of the genes 
coding for the enzymes aldehyde dehydrogenase
(ALD2) and acetyl-coenzyme A (CoA) synthase
(ACS1), which function
together to convert the products of
alcohol dehydrogenase into acetyl-CoA, 
which in turn is used to fuel the tricarbox- 
ylic acid (TCA) cycle and the glyoxylate 
cycle. The concomitant shutdown of tran- 
scription of the genes encoding pyruvate 
decarboxylase and induction of pyruvate 
carboxylase rechannels pyruvate away 
from acetaldehyde, and instead to oxalac- 
etate, where it can serve to supply the 
TCA cycle and gluconeogenesis. Induction
of the pivotal genes PCK1, encoding
phosphoenolpyruvate carboxykinase, and
FBP1, encoding fructose 1,6-biphosphatase,
switches the directions of two key
irreversible steps in glycolysis, reversing
the flow of metabolites along the reversible
steps of the glycolytic pathway toward
the essential biosynthetic precursor,
glucose-6-phosphate. Induction of the genes
coding for the trehalose synthase and gly- 
cogen synthase complexes promotes chan- 
neling of glucose-6-phosphate into these 
carbohydrate storage pathways. 

Just as the changes in expression of 
genes encoding pivotal enzymes can pro- 
vide insight into metabolic reprogram- 
ming, the behavior of large groups of func- 
tionally related genes can provide a broad 
view of the systematic way in which the 
yeast cell adapts to a changing environ- 
ment (Fig. 4). Several classes of genes, 
such as cytochrome c-related genes and 
those involved in the TCA/glyoxylate cycle
and carbohydrate storage, were coordinately
induced by glucose exhaustion. In
contrast, genes devoted to protein synthe- 
sis, including ribosomal proteins, tRNA 
synthetases, and translation, elongation, 
and initiation factors, exhibited a coordi- 
nated decrease in expression. More than 
95% of ribosomal genes showed at least 
twofold decreases in expression during the 
diauxic shift (Fig. 4) (13). A noteworthy 
and illuminating exception was that the
genes encoding mitochondrial ribosomal
proteins were generally induced rather than
repressed after glucose limitation, highlighting
the requirement for mitochondrial
biogenesis (13). As more is learned about 
the functions of every gene in the yeast 
genome, the ability to gain insight into a 
cell's response to a changing environment 
through its global gene expression patterns 
will become increasingly powerful. 

Several distinct temporal patterns of ex- 
pression could be recognized, and sets of 
genes could be grouped on the basis of the 
similarities in their expression patterns. The 
characterized members of each of these 
groups also shared important similarities in 
their functions. Moreover, in most cases, 
common regulatory mechanisms could be 
inferred for sets of genes with similar expres- 
sion profiles. For example, seven genes 
showed a late induction profile, with mRNA 
levels increasing by more than ninefold at 
the last timepoint but less than threefold at
the preceding timepoint (Fig. 5B). All of 
these genes were known to be glucose-re- 
pressed, and five of the seven were previously 
noted to share a common upstream activat- 
ing sequence (UAS), the carbon source re- 
sponse element (CSRE) (16-20). A search 
in the promoter regions of the remaining two 
genes, ACR1 and IDP2, revealed that
ACR1, a gene essential for ACS1 activity,
also possessed a consensus CSRE motif, but
interestingly, IDP2 did not. A search of the
entire yeast genome sequence for the con- 
sensus CSRE motif revealed only four addi- 
tional candidate genes, none of which 
showed a similar induction. 
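
The late induction profile described above is defined by two thresholds, which makes it easy to express as a filter. The Python sketch below is only an illustration of that criterion: the function name and the toy profiles are hypothetical, and the grouping of genes in the study itself may have involved more than this simple rule.

```python
def late_induction(profiles, last_min=9.0, prev_max=3.0):
    """Return genes whose ratio exceeds `last_min` at the final time point
    but stays below `prev_max` at the time point just before it.

    profiles: dict mapping gene name -> list of expression ratios
              (sample / reference), ordered by time.
    """
    return sorted(
        gene
        for gene, series in profiles.items()
        if len(series) >= 2 and series[-1] > last_min and series[-2] < prev_max
    )


# Hypothetical ratios for three placeholder genes.
profiles = {
    "geneA": [1.0, 1.2, 2.5, 12.0],  # late, sharp induction
    "geneB": [1.0, 3.5, 6.0, 11.0],  # induced, but already high earlier
    "geneC": [1.0, 1.0, 1.1, 2.0],   # never strongly induced
}

late_genes = late_induction(profiles)
print(late_genes)
```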

Examples from additional groups of
genes that shared expression profiles are
illustrated in Fig. 5, C through F. The
sequences upstream of the named genes in
Fig. 5C all contain stress response elements
(STRE), and with the exception
of HSP42, have previously been shown to
be controlled at least in part by these
elements (21-24). Inspection of the sequences
upstream of HSP42 and the two
uncharacterized genes shown in Fig. 5C,
YKL026c, a hypothetical protein with
similarity to glutathione peroxidase, and
YGR043c, a putative transaldolase, revealed
that each of these genes also possesses
repeated upstream copies of the
stress-responsive CCCCT motif. Of the 13 additional
genes in the yeast genome that
shared this expression profile [including
HSP30, ALD2, OM45, and 10 uncharacterized
ORFs (25)], nine contained one or
more recognizable STRE sites in their upstream
regions.

Fig. 1. Yeast genome microarray. The actual size of the microarray is 18 mm by 18 mm. The
microarray was printed as described (9). This image was obtained with the same fluorescent
scanning confocal microscope used to collect all the data we report (49). A fluorescently labeled
cDNA probe was prepared from mRNA isolated from cells harvested shortly after inoculation (culture
density of <5 × 10^6 cells/ml and media glucose level of 19 g/liter) by reverse transcription in the
presence of Cy3-dUTP. Similarly, a second probe was prepared from mRNA isolated from cells taken
from the same culture 9.5 hours later (culture density of ~2 × 10^8 cells/ml, with a glucose level of
<0.2 g/liter) by reverse transcription in the presence of Cy5-dUTP. In this image, hybridization of the
Cy3-dUTP-labeled cDNA (that is, mRNA expression at the initial timepoint) is represented as a green
signal, and hybridization of Cy5-dUTP-labeled cDNA (that is, mRNA expression at 9.5 hours) is
represented as a red signal. Thus, genes induced or repressed after the diauxic shift appear in this
image as red and green spots, respectively. Genes expressed at roughly equal levels before and after
the diauxic shift appear in this image as yellow spots.

The heterotrimeric transcriptional activator
complex HAP2,3,4 has been shown
to be responsible for induction of several 
genes important for respiration (26-28). 
This complex binds a degenerate consensus 
sequence known as the CCAAT box (26). 
Computer analysis, using the consensus se- 
quence TNRYTGGB (29), has suggested 
that a large number of genes involved in 
respiration may be specific targets of 
HAP2,3,4 (30). Indeed, a putative 
HAP2,3,4 binding site could be found in
the sequences upstream of each of the seven 
cytochrome c-related genes that showed 
the greatest magnitude of induction (Fig. 
5D). Of 12 additional cytochrome c-related 
genes that were induced, HAP2,3,4 binding 
sites were present in all but one. Signifi- 
cantly, we found that transcription of 
HAP4 itself was induced nearly ninefold 
concomitant with the diauxic shift. 
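
Searches of upstream sequences for degenerate motifs such as TNRYTGGB or the STRE core CCCCT can be sketched in Python by translating the standard IUPAC nucleotide codes into a regular expression. This is an illustrative assumption about how such a search could be done, not the program used for the analysis (29, 30); the helper names and the example sequence are hypothetical.

```python
import re

# Standard IUPAC nucleotide ambiguity codes as regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}


def motif_pattern(consensus):
    """Translate an IUPAC consensus string into a regex pattern string."""
    return "".join(IUPAC[base] for base in consensus.upper())


def reverse_complement(seq):
    """Return the reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_sites(consensus, upstream):
    """List (strand, offset, matched sequence) for every motif occurrence.

    Offsets on the "-" strand are relative to the reverse-complemented
    sequence. A lookahead is used so that overlapping sites are reported.
    """
    lookahead = re.compile("(?=(%s))" % motif_pattern(consensus))
    hits = []
    strands = (("+", upstream.upper()), ("-", reverse_complement(upstream)))
    for strand, seq in strands:
        for match in lookahead.finditer(seq):
            hits.append((strand, match.start(), match.group(1)))
    return hits


# Hypothetical upstream sequence, invented for the example.
upstream = "GGATTAGCTGGCAAACCCCTAA"
hap_sites = find_sites("TNRYTGGB", upstream)
stre_sites = find_sites("CCCCT", upstream)
print(hap_sites, stre_sites)
```

Scanning both strands matters because a binding site can lie in either orientation relative to the gene.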

Control of ribosomal protein biogenesis 
is mainly exerted at the transcriptional 
level, through the presence of a common 
upstream-activating element (UAS)
that is recognized by the Rap1 DNA-binding
protein (31, 32). The expression profiles
of seven ribosomal proteins are shown
in Fig. 5F. A search of the sequences
upstream of all seven genes revealed consensus
Rap1-binding motifs (33). It has
been suggested that declining Rap1 levels
in the cell during starvation may be responsible
for the decline in ribosomal protein
gene expression (34). Indeed, we observed
that the abundance of RAP1
mRNA diminished by 4.4-fold, at about
the time of glucose exhaustion. 

Of the 149 genes that encode known or 
putative transcription factors, only two, 
HAP4 and SIP4, were induced by a factor of 
more than threefold at the diauxic shift. 
SIP4 encodes a DNA-binding transcriptional
activator that has been shown to
interact with Snf1, the "master regulator" of
glucose repression (35). The eightfold induction
of SIP4 upon depletion of glucose
strongly suggests a role in the induction of
downstream genes at the diauxic shift.

Although most of the transcriptional 
responses that we observed were not pre- 
viously known, the responses of many 
genes during the diauxic shift have been 
described. Comparison of the results we 
obtained by DNA microarray hybridiza- 
tion with previously reported results there- 
fore provided a strong test of the sensitiv- 
ity and accuracy of this approach. The 
expression patterns we observed for previ- 
ously characterized genes showed almost 
perfect concordance with previously pub- 
lished results (36). Moreover, the differ- 
ential expression measurements obtained 
by DNA microarray hybridization were re- 
producible in duplicate experiments. For 
example, the remarkable changes in gene 
expression between cells harvested imme- 
diately after inoculation and immediately 
after the diauxic shift (the first and sixth 
intervals in this time series) were mea- 
sured in duplicate, independent DNA mi- 
croarray hybridizations. The correlation 
coefficient for two complete sets of expres- 
sion ratio measurements was 0.87, and for 
more than 95% of the genes, the expression
ratios measured in these duplicate
experiments differed by less than a factor 
of 2. However, in a few cases, there were 
discrepancies between our results and pre- 
vious results, pointing to technical limita- 
tions that will need to be addressed as 
DNA microarray technology advances 
(37, 38). Despite the noted exceptions, 
the high concordance between the results 
we obtained in these experiments and 
those of previous studies provides confi- 
dence in the reliability and thoroughness 
of the survey. 
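
The two reproducibility measures quoted above, a correlation coefficient between duplicate hybridizations and the fraction of genes whose ratios agree within a factor of 2, can be computed as in the following Python sketch. Whether the reported coefficient was calculated on raw or log-transformed ratios is not stated, so the use of log2 ratios here is an assumption, and the duplicate measurements are invented.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def fraction_within(rep1, rep2, factor=2.0):
    """Fraction of genes whose duplicate ratios differ by less than `factor`."""
    limit = math.log2(factor)
    close = sum(
        1 for a, b in zip(rep1, rep2)
        if abs(math.log2(a) - math.log2(b)) < limit
    )
    return close / len(rep1)


# Hypothetical expression ratios for the same five genes in two replicates.
rep1 = [8.0, 2.0, 1.0, 0.5, 0.25]
rep2 = [6.0, 2.5, 1.2, 0.4, 0.6]

r = pearson([math.log2(v) for v in rep1], [math.log2(v) for v in rep2])
agreement = fraction_within(rep1, rep2)
print(round(r, 2), agreement)
```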

The changes in gene expression during 
this diauxic shift are complex and involve 
integration of many kinds of information 
about the nutritional and metabolic state 
of the cell. The large number of genes 
whose expression is altered and the diver- 
sity of temporal expression profiles ob- 
served in this experiment highlight the 
challenge of understanding the underlying 
regulatory mechanisms. One approach to 
defining the contributions of individual 
regulatory genes to a complex program of 
this kind is to use DNA microarrays to 
identify genes whose expression is affected
by mutations in each putative regulatory
gene. As a test of this strategy, we analyzed
the genomewide changes in gene expression
that result from deletion of the TUP1 gene.
Transcriptional repression of many genes by
glucose requires the DNA-binding repressor
Mig1 and is mediated by recruiting the transcriptional
co-repressors Tup1 and Cyc8/Ssn6 (39).
Tup1 has also been implicated in
repression of oxygen-regulated, mating-type-specific,
and DNA-damage-inducible genes
(40).

Fig. 2. The section of the array indicated by the gray box
in Fig. 1 is shown for each of the experiments described
here. Representative genes are labeled. In each of the arrays
used to analyze gene expression during the diauxic
shift, red spots represent genes that were induced relative
to the initial timepoint, and green spots represent
genes that were repressed relative to the initial timepoint.
In the arrays used to analyze the effects of the tup1Δ mutation
and YAP1 overexpression, red spots represent
genes whose expression was increased, and green spots
represent genes whose expression was decreased by
the genetic modification. Note that distinct sets of genes are
induced and repressed in the different experiments. The
complete images of each of these arrays can be viewed on
the Internet (13). Cell density as measured by optical density
(OD) at 600 nm was used to measure the growth of the
culture.







Fig. 3. Metabolic reprogramming inferred from global analysis of changes in gene expression. Only key 
metabolic intermediates are identified. The yeast genes encoding the enzymes that catalyze each step 
in this metabolic circuit are identified by name in the boxes. The genes encoding succinyl-CoA synthase 
and glycogen-debranching enzyme have not been explicitly identified, but the ORFs YGR244 and 
YPR184 show significant homology to known succinyl-CoA synthase and glycogen-debranching
enzymes, respectively, and are therefore included in the corresponding steps in this figure. Red boxes with
white lettering identify genes whose expression increases in the diauxic shift. Green boxes with dark 
green lettering identify genes whose expression diminishes in the diauxic shift. The magnitude of 
induction or repression is indicated for these genes. For multimeric enzyme complexes, such as 
succinate dehydrogenase, the indicated fold-induction represents an unweighted average of all the 
genes listed in the box. Black and white boxes indicate no significant differential expression (less than 
twofold). The direction of the arrows connecting reversible enzymatic steps indicates the direction of the
flow of metabolic intermediates, inferred from the gene expression pattern, after the diauxic shift. Arrows 
representing steps catalyzed by genes whose expression was strongly induced are highlighted in red. 
The broad gray arrows represent major increases in the flow of metabolites after the diauxic shift, 
inferred from the indicated changes in gene expression. 



Wild-type yeast cells and cells bearing a deletion of the TUP1 gene (tup1Δ) were grown in
parallel cultures in rich medium containing glucose as the carbon source. Messenger RNA
was isolated from exponentially growing cells from the two populations and used to prepare
cDNA labeled with Cy3 (green) and Cy5 (red), respectively (11). The labeled probes were
mixed and simultaneously hybridized to the microarray. Red spots on the microarray
therefore represented genes whose transcription was induced in the tup1Δ strain, and thus
presumably repressed by Tup1 (41). A representative section of the microarray (Fig. 2,
bottom middle panel) illustrates that the genes whose expression was affected by the tup1Δ
mutation were, in general, distinct from those induced upon glucose exhaustion [complete
images of all the arrays shown in Fig. 2 are available on the Internet (13)]. Nevertheless,
34 (10%) of the genes that were induced by a factor of at least 2 after the diauxic shift
were similarly induced by deletion of TUP1, suggesting that these genes may be subject to
TUP1-mediated repression by glucose. For example, SUC2, the gene encoding invertase, and
all five hexose transporter genes that were induced during the course of the diauxic shift
were similarly induced, in duplicate experiments, by the deletion of TUP1.
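
As a rough illustration of how a two-color readout of this kind could be turned into gene lists, the sketch below calls a gene induced when its red/green (Cy5/Cy3) ratio is at least 2 and then intersects the induced sets from two experiments. The intensity values and the placeholder names GENE_X, GENE_Y, and GENE_Z are invented for illustration, and the helper induced_genes is hypothetical; this is not the authors' analysis software.

```python
def induced_genes(intensities, min_fold=2.0):
    """Return the genes whose Cy5/Cy3 (red/green) ratio is at least min_fold.

    intensities maps a gene name to a (cy3_signal, cy5_signal) pair.
    """
    induced = set()
    for gene, (cy3, cy5) in intensities.items():
        # Skip spots with no reference signal to avoid dividing by zero.
        if cy3 > 0 and cy5 / cy3 >= min_fold:
            induced.add(gene)
    return induced


# Invented intensities: (Cy3 reference signal, Cy5 test signal).
tup1_deletion = {
    "SUC2": (100.0, 900.0),
    "GENE_X": (200.0, 500.0),
    "GENE_Y": (1000.0, 1050.0),
}
diauxic_shift = {
    "SUC2": (100.0, 400.0),
    "GENE_X": (200.0, 450.0),
    "GENE_Y": (1000.0, 900.0),
    "GENE_Z": (50.0, 600.0),
}

# Genes induced at least twofold in both experiments.
induced_in_both = induced_genes(tup1_deletion) & induced_genes(diauxic_shift)
print(sorted(induced_in_both))
```

A fixed twofold cutoff mirrors the threshold used in the text; a real analysis would also need background subtraction and normalization between the two channels, which this sketch omits.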

The set of genes affected by Tup1 in this experiment also included α-glucosidases, the
mating-type-specific genes MFA1 and MFA2, and the DNA damage-inducible RNR2 and RNR4, as
well as genes involved in flocculation and many genes of unknown function. The
hybridization signal corresponding to expression of TUP1 itself was also severely reduced
because of the (incomplete) deletion of the transcription unit in the tup1Δ strain,
providing a positive control in the experiment (42).

Many of the transcriptional targets of Tup1 fell into sets of genes with related
biochemical functions. For instance, although only about 3% of all yeast genes appeared to
be TUP1-repressed by a factor of more than 2 in duplicate experiments under these
conditions, 6 of the 13 genes that have been implicated in flocculation (15) showed a
reproducible increase in expression of at least twofold when TUP1 was deleted. Another
group of related genes that appeared to be subject to TUP1 repression encodes the
serine-rich cell wall mannoproteins, such as Tip1 and Tir1/Srp1, which are induced by cold
shock and other stresses (43), and similar, serine-poor proteins, the seripauperins (44).
Messenger RNA levels for 23 of the 26 genes in this group were reproducibly elevated by at
least 2.5-fold in the tup1Δ strain, and 18 of these genes were induced by more than
sevenfold when TUP1 was deleted. In contrast, none of 83 genes that could be classified as
putative regulators of the cell division cycle were induced more than twofold by deletion
of TUP1. Thus, despite the diversity of the regulatory systems that employ Tup1, most of
the genes that it regulates under these conditions fall into a limited number of distinct
functional classes.
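
To give a feel for how unusual "6 of 13" is against a background rate of about 3%, the sketch below computes a simple binomial tail probability. This test is not reported in the text; it is an illustrative calculation that assumes each gene is affected independently with the same genome-wide probability, which is a simplification.

```python
from math import comb


def binomial_tail(k, n, p):
    """Probability of k or more successes in n independent trials with success rate p."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# 6 or more of the 13 flocculation genes affected, if each had only a 3% chance.
p_value = binomial_tail(6, 13, 0.03)
print(f"{p_value:.2e}")
```

Under these assumptions the probability comes out on the order of one in a million, consistent with the point that Tup1 targets cluster in particular functional classes.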

Because the microarray allows us to monitor expression of nearly every gene in yeast, we
can, in principle, use this approach to identify all the transcriptional targets of a
regulatory protein like Tup1. It is important to note, however, that in any single
experiment of this kind we can only recognize those target genes that are normally
repressed (or induced) under the conditions of the experiment. For instance, the
experiment described here analyzed a MATα strain in which MFA1 and MFA2, the genes
encoding the a-factor mating pheromone precursor, are normally repressed. In the isogenic
tup1Δ strain, these genes were inappropriately expressed, reflecting the role that Tup1
plays in their repression. Had we instead carried out this experiment with a MATa strain
(in which expression of MFA1 and MFA2 is not repressed), it would not have been possible
to conclude anything regarding the role of Tup1 in the repression of these genes.
Conversely, we cannot distinguish indirect effects of the chronic absence of Tup1 in the
mutant strain from effects directly attributable to its participation in repressing the
transcription of a gene.

Another simple route to modulating the activity of a regulatory factor is to overexpress
the gene that encodes it. YAP1 encodes a DNA-binding transcription factor belonging to the
b-zip class of DNA-binding proteins. Overexpression of YAP1 in yeast confers increased
resistance to hydrogen peroxide, o-phenanthroline, heavy metals, and osmotic stress (45).
We analyzed differential gene expression between a wild-type strain bearing a control
plasmid and a strain with a plasmid expressing YAP1 under the control of the strong
GAL1-10 promoter, both grown in galactose (that is, a condition that induces YAP1
overexpression). Complementary DNA from the control and YAP1-overexpressing strains,
labeled with Cy3 and Cy5, respectively, was prepared from mRNA isolated from the two
strains and hybridized to the microarray. Thus, red spots on the array represent genes
that were induced in the strain overexpressing YAP1.

Of the 17 genes whose mRNA levels increased by more than threefold when YAP1 was
overexpressed in this way, five bear homology to aryl-alcohol oxidoreductases (Fig. 2 and
Table 1). An additional four of the genes in this set also belong to the general class of
dehydrogenases/oxidoreductases. Very little is known about the role of aryl-alcohol
oxidoreductases in S. cerevisiae, but these enzymes have been isolated from ligninolytic
fungi, in which they participate in coupled redox reactions, oxidizing aromatic and
aliphatic unsaturated alcohols to aldehydes with the production of hydrogen peroxide
(46, 47). The fact that a remarkable fraction of the targets identified in this experiment
belong to the same small, functional group of oxidoreductases suggests that these genes
might play an important protective role during oxidative stress. Transcription of a small
number of genes was reduced in the strain overexpressing Yap1. Interestingly, many of these
genes encode sugar permeases or enzymes involved in inositol metabolism.

Fig. 4. Coordinated regulation of functionally related genes. The curves represent the
average induction or repression ratios for all the genes in each indicated group. The total
number of genes in each group was as follows: ribosomal proteins, 112; translation
elongation and initiation factors, 25; tRNA synthetases (excluding mitochondrial
synthetases), 17; glycogen and trehalose synthesis and degradation, 15; cytochrome c
oxidase and reductase proteins, 19; and TCA- and glyoxylate-cycle enzymes, 24.

Table 1. Genes induced by YAP1 overexpression. This list includes all the genes for which mRNA levels
increased by more than twofold upon YAP1 overexpression in both of two duplicate experiments, and
for which the average increase in mRNA level in the two experiments was greater than threefold (50).
Positions of the canonical Yap1 binding sites upstream of the start codon, when present, and the
average fold-increase in mRNA levels measured in the two experiments are indicated.
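
The selection rule stated in the Table 1 caption (more than twofold in each of two duplicate experiments, and an average above threefold) can be sketched as a small filter. The gene names and fold-change values below are invented placeholders, and passes_table1_criteria is a hypothetical helper written only to illustrate the rule.

```python
def passes_table1_criteria(fold_a, fold_b):
    """True if both replicates exceed twofold and their mean exceeds threefold."""
    both_above_twofold = fold_a > 2.0 and fold_b > 2.0
    mean_above_threefold = (fold_a + fold_b) / 2.0 > 3.0
    return both_above_twofold and mean_above_threefold


# Invented fold-increases from two duplicate experiments.
measurements = {
    "GENE_A": (4.1, 3.5),  # passes: both above 2, mean 3.8
    "GENE_B": (2.2, 2.4),  # fails: mean is only 2.3
    "GENE_C": (6.0, 1.8),  # fails: second replicate is below 2
}

selected = [gene for gene, (a, b) in measurements.items() if passes_table1_criteria(a, b)]
print(selected)
```

Requiring both replicates to pass guards against a single outlying measurement, as GENE_C illustrates: its mean is 3.9, yet it is excluded.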

We searched for Yap1-binding sites (TTACTAA or TGACTAA) in the sequences upstream of the
target genes we identified (48). About two-thirds of the genes that were induced by more
than threefold upon Yap1 overexpression had one or more binding sites within 600 bases
upstream of the start codon (Table 1), suggesting that they are directly regulated by
Yap1. The absence of canonical Yap1-binding sites upstream of the others may reflect an
ability of Yap1 to bind sites that differ from the canonical binding sites, perhaps in
cooperation with other factors, or, less likely, may represent an indirect effect of Yap1
overexpression, mediated by one or more intermediary factors. Yap1 sites were found only
four times in the corresponding region of an arbitrary set of 30 genes that were not
differentially regulated by Yap1.
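
A motif search of this kind can be sketched in a few lines. The example below scans the 600 bases upstream of a start codon for the two canonical sites named in the text. Searching both strands is an assumption made here, since the text does not say how strand was handled, and find_sites and the toy sequences are hypothetical.

```python
import re

YAP1_SITES = ("TTACTAA", "TGACTAA")  # canonical sites named in the text


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_sites(upstream, window=600):
    """Locate canonical sites in the last `window` bases before the start codon.

    `upstream` is written 5'->3' and ends immediately before the start codon.
    Returns sorted (distance, strand, motif) tuples, where distance is the
    number of bases from the first base of the match to the start codon.
    """
    region = upstream.upper()[-window:]
    hits = []
    for motif in YAP1_SITES:
        for strand, pattern in (("+", motif), ("-", reverse_complement(motif))):
            # A lookahead lets overlapping occurrences be reported as well.
            for match in re.finditer(f"(?={pattern})", region):
                hits.append((len(region) - match.start(), strand, motif))
    return sorted(hits)


# Toy upstream sequence with one forward-strand site 17 bases from the start codon.
print(find_sites("GG" + "TTACTAA" + "A" * 10))
```

Neither motif is its own reverse complement, so a site is never counted twice when both strands are scanned.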

Use of a DNA microarray to characterize the transcriptional consequences of
mutations affecting the activity of regulatory molecules provides a simple and
powerful approach to dissection and characterization of regulatory pathways and
networks. This strategy also has an important
practical application in drug screening. 
Mutations in specific genes encoding can- 
didate drug targets can serve as surrogates 
for the ideal chemical inhibitor or modu- 
lator of their activity. DNA microarrays 
can be used to define the resulting signa- 
ture pattern of alterations in gene expres- 
sion, and then subsequently used in an 
assay to screen for compounds that repro- 
duce the desired signature pattern. 
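
One hedged way to picture such a screen is to score each compound's expression profile against the signature of the mutant. The sketch below uses a Pearson correlation of log2 expression ratios; the choice of correlation as the similarity measure, the compound names, and all values are assumptions for illustration and are not described in the text.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


# Invented log2 expression ratios over the same five genes.
mutant_signature = [3.0, 2.5, -1.0, 0.1, -2.0]
compound_profiles = {
    "compound_1": [2.8, 2.0, -0.8, 0.0, -1.5],
    "compound_2": [-0.2, 0.1, 1.5, -0.3, 0.4],
}

scores = {name: pearson(mutant_signature, profile) for name, profile in compound_profiles.items()}
best_match = max(scores, key=scores.get)
print(best_match)
```

A compound whose profile tracks the mutant signature closely would rank first; in practice the comparison would run over thousands of genes rather than five.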

DNA microarrays provide a simple and 
economical way to explore gene expres- 
sion patterns on a genomic scale. The 
hurdles to extending this approach to any
other organism are minor. The equipment
required for fabricating and using DNA
microarrays (9) consists of components 
that were chosen for their modest cost and 
simplicity. It was feasible for a small group 
to accomplish the amplification of more 
than 6000 genes in about 4 months and, 
once the amplified gene sequences were in 
hand, only 2 days were required to print a 
set of 110 microarrays of 6400 elements 
each. Probe preparation, hybridization, 
and fluorescent imaging are also simple 
procedures. Even conceptually simple ex- 
periments, as we described here, can yield 
vast amounts of information. The value of 
the information from each experiment of 
this kind will progressively increase as 
more is learned about the functions of 
each gene and as additional experiments 
define the global changes in gene expres- 
sion in diverse other natural processes and 
genetic perturbations. Perhaps the greatest 
challenge now is to develop efficient 
methods for organizing, distributing, inter- 
preting, and extracting insights from the 
large volumes of data these experiments 
will provide. 

Fig. 5. Distinct temporal patterns of induction or repression help to group genes that share regulatory
properties. (A) Temporal profile of the cell density, as measured by OD at 600 nm, and glucose
concentration in the media. (B) Seven genes exhibited a strong induction (greater than ninefold) only at
the last timepoint (20.5 hours). With the exception of IDP2, each of these genes has a CSRE UAS. There
were no additional genes observed to match this profile. (C) Seven members of a class of genes marked
by early induction with a peak in mRNA levels at 18.5 hours. Each of these genes contains STRE motif
repeats in its upstream promoter region. (D) Cytochrome c oxidase and ubiquinol cytochrome c
reductase genes. Marked by an induction coincident with the diauxic shift, each of these genes contains
a consensus binding motif for the HAP2,3,4 protein complex. At least 17 genes shared a similar
expression profile. (E) SAM1, GPP1, and several genes of unknown function are repressed before the
diauxic shift, and continue to be repressed upon entry into stationary phase. (F) Ribosomal protein
genes comprise a large class of genes that are repressed upon depletion of glucose. Each of the genes
profiled here contains one or more RAP1-binding motifs upstream of its promoter. RAP1 is a
transcriptional regulator of most ribosomal proteins.

Human Genome Placed on Chip; Biotech Rivals Put It Up
for Sale 

By ANDREW POLLACK (NYT) 1030 words 
The genome on a chip has arrived. 

Melding high technology with biology, several companies are rushing to sell slivers of glass or 
nylon, some as small as postage stamps, packed with pieces of all 30,000 or so known human 
genes. 

The new products will allow scientists to scan all genes in a human tissue sample at once, to 
determine which genes are active, a job that previously required two or more chips. The whole- 
genome chips will lower the cost and increase the speed of a widely used test that has 
transformed biomedical research in the last few years. 

"It's sort of a milestone event, very similar to generating an integrated circuit of the genome," 
said Stephen P. A. Fodor, the chief executive of Affymetrix Inc., the leading seller of gene chips, 
which are also called microarrays. 

Affymetrix, based in Santa Clara, Calif., is expected to announce today that it is accepting orders
for its whole-genome chip. 

The announcement seems timed to steal some thunder from the rival Agilent Technologies, 
which is based in nearby Palo Alto. Agilent is to be the host of an analyst meeting today and it 
plans to announce then that it has started shipping test versions of its whole-genome chip. 

Applied Biosystems of Foster City, Calif., a unit of the Applera Corporation, started the race in
July with an announcement that it would have a whole-genome chip out by the end of this year. 
NimbleGen Systems, a small company in Madison, Wis., announced a few days later that it had a 
genome on a chip that it was not selling but that it was using to run tests for customers. 

Gene chips, which detect genes that are active, meaning they are being used to make a protein, 
have become essential tools. Scientists try to understand the genetic mechanisms of disease by 
seeing which genes are turned on in, say, a sick kidney or lung compared with those active in a 
healthy organ. Pharmaceutical companies look at gene activity patterns to try to predict the 
effects of drugs. 



Scientists have found that tumors that look the same under the microscope can differ in terms of 
which genes are active. So studying gene patterns could become a way to discriminate between 
deadly and not-so-deadly tumors, or to predict which drug will work best for a particular patient.

Still, even some vendors conceded that the change from two chips to one is more symbolic than 
revolutionary. 

"You can do just as good science with two chips, it costs you a little more," said Roland Green, 
the vice president for research and development at NimbleGen. 

Some scientists questioned whether the chips really have all human genes, because the exact 
number and identities of all the genes is not known. 

The advent of the genome on a chip is, however, evidence that biotechnology, to the extent that it 
uses electronics, is experiencing some of the rapid progress that has made semiconductors and 
computers continuously cheaper and smaller. 

"One of the effects everyone is looking for in the genomics area is Moore's law - more data, less
money," said Doug Dolginow, an executive vice president at Gene Logic, which sells data from 
gene chip studies to pharmaceutical companies. "This is a step in that direction." 

Moore's law states that the number of transistors on a semiconductor chip doubles every 18 
months. 

Affymetrix's gene chips are, in fact, made with the same techniques used to make semiconductor 
chips. In the mid-1990's, the company came out with a set of five chips covering what was then
known of the human genome. After the human genome sequence was virtually completed in 
2000, the company developed a two-chip set with all the known genes. Now it has the single 
chip, which some scientists say will be more convenient. 

"We like to be able to look at all genes at one time to get a global view of what's going on," said 
John R. Walker, who runs gene chip operations at the Genomics Institute of the Novartis 
Research Foundation in San Diego. 

Costs should also be lower. Gene chips have been so expensive that many academic scientists 
still make their own rather than buy them. Affymetrix said it would sell its whole-genome chips 
for $300 to $500 each, depending on volume, little more than half the price of the two-chip set. 
The other companies have not announced prices. 

For Affymetrix, a successful whole-genome chip "is essential for them to maintain their 
dominance" of high-end microarrays, said Edward A. Tenthoff, an analyst at U.S. Bancorp Piper 
Jaffray. Affymetrix had total product sales in 2002 of about $250 million, and a company 
spokesman said that human genome chips are its top-selling product. 

Mr. Tenthoff, who recommends Affymetrix stock, said the company's sales growth rate had 
moderated as it faces tougher competition. Agilent, a spinoff of Hewlett-Packard that makes its 
gene chips by printing DNA components onto glass slides using ink jet printers, has gained 
share, he said. Applied Biosystems, the largest maker of genomics equipment over all, will be
entering the microarray segment of the business with its whole-genome chip, emphasizing the
connection of that product to the others it offers, including the gene database developed by its 
sister company, Celera Genomics. 

Jeffrey Trent, scientific director of the Translational Genomics Research Institute in Phoenix, 
said that while whole-genome chips are useful for medical discovery, the biggest growth of the 
market will be for chips that can be used by doctors to do diagnoses. And whole-genome chips 
are too cumbersome for that, he said. Rather, once scientists use the whole-genome chips to find 
particular genes that are associated with, say, tumor aggressiveness or drug effectiveness, he 
said, they will then make smaller and cheaper chips containing just those genes for use in 
diagnosis. 

Agilent Technologies ships whole human genome on single
microarray to gene expression customers for evaluation 

Company to introduce first commercial whole human microarray by end of year 
PALO ALTO, Calif., Oct. 2, 2003 

Agilent Technologies Inc. (NYSE: A) today announced it has shipped whole human-genome microarrays
to customers for testing and evaluation. The whole genome microarray is based on Agilent's new
double-density format, which can accommodate 44,000 features on a single 1" x 3" glass-slide microarray. The
new platform enables drug-discovery and disease researchers to perform whole-genome screening at a 
lower cost and with higher reproducibility. 

"This is an important step toward our release of the first whole human-genome microarray product, which
is expected to be available for order before the end of the year," said Barney Saunders, vice president
and general manager of Agilent's BioResearch Solutions Unit. "Customers have long wanted a
one-sample, one-chip format with the increased sensitivity associated with 60-mer probes. The cost savings
and high-quality performance make this product a compelling alternative for scientists who make their
own microarrays."

Agilent's microarrays are based on the industry-standard 1" x 3" (25mm x 75mm) format, which is 
compatible with most commercial microarray scanners. All Agilent commercial microarrays are developed 
using content from public databases and proprietary sources, with full sequence and annotation 
information made available to customers. Gene sequences for probes are developed using algorithms 
and then validated empirically through iterative wet-lab testing procedures. The result is a microarray 
comprised of functionally validated probes, with the most up-to-date and comprehensive genome 
information commercially available. 

Advantages of the double-density format include: 

• Lower cost. Not only is one microarray less expensive than two, it requires fewer reagents and 
reduces instrumentation demands. 

• Streamlined workflow. Researchers need prepare and process only one microarray instead of 
two. This also results in fewer steps in the subsequent data analysis. 

• Greater reproducibility. Use of a single microarray further reduces unnecessary variability in 
experimental conditions. 

• Smaller sample use. A smaller quantity of sample material is required to perform an experiment.

Availability

Agilent's Whole Human Genome Microarray is expected to be available for order by the end of the year. 

About Agilent Technologies

Agilent Technologies Inc. (NYSE: A) is a global technology leader in communications, electronics, life
sciences and chemical analysis. The company's 30,000 employees serve customers in more than 110
countries. Agilent had net revenue of $6 billion in fiscal year 2002. Information about Agilent is available
on the Web at www.agilent.com.

Forward-looking Statements

This news release contains forward-looking statements (including, without limitation, statements relating 
to Agilent's expectation that its whole-genome microarray platform will be available for order before the 
end of 2003) that involve risks and uncertainties that could cause results to differ materially from 
management's current expectations. These and other risks are detailed in the company's filings with the 
Securities and Exchange Commission, including its Annual Report on Form 10-K for the year ended Oct.
31, 2002, its Quarterly Report on Form 10-Q for the quarter ended July 31, 2003 and its Current Report 
on Form 8-K filed Aug. 18, 2003. The company assumes no obligation to update the information in this 
press release. 

Affymetrix Announces Commercial Launch of Single Array for Human Genome
Expression Analysis 




Affymetrix GeneChip(R) Brand Human Genome U133 Plus 2.0 Array. (PRNewsFoto)[AS]

SANTA CLARA, CA USA 10/02/2003




More Than 1 Million Probes Analyze Expression Levels of Nearly 50,000 RNA 
Transcripts and Variants on a Single Array the Size of a Thumbnail 

SANTA CLARA, Calif., Oct. 2 /PRNewswire/ Affymetrix, Inc., 
(Nasdaq: AFFX) announced today that it is taking orders for its new 
GeneChip(R) brand Human Genome U133 Plus 2.0 Array, offering researchers the 
protein-coding content of the human genome on a single commercially available 
catalog microarray. The HG-U133 Plus 2.0 Array analyzes the expression level 
of nearly 50,000 RNA transcripts and variants with 22 different probes per 
transcript, providing superior data quality unmatched by technologies using a 
single probe per transcript. 

(Photo: http://www.newscom.com/cgi-bin/prnh/20031002/SFTH021 ) 

"With about 1.3 million probes on a chip the size of a human thumbnail, 
the Human Plus Array represents a leap in array technology data capacity, and 
further demonstrates the unique power and potential of our technology to 
explore vast areas of the genome," said Trevor J. Nicholls, Ph.D., Chief 
Commercial Officer. "Multiple independent measurements for each transcript 
ensure that our data quality remains the industry standard, even as our data 
capacity increases dramatically." 

The HG-U133 Plus 2.0 Array, which will ship in October, combines the 
content of the previous HG-U133 two-array set with nearly 10,000 new probe 
sets representing about 6,500 new genes, for a total of nearly 50,000 RNA 
transcripts and variants. This new information, verified against the latest 
version of the publicly available genome map, provides researchers the most 
comprehensive and up-to-date genome-wide gene expression analysis. The probe 
design strategy of the HG-U133 Plus 2.0 Array is identical to the previous HG- 
U133 Set, providing very strong data concordance between the two products. 
With more than double the data capacity of the previous-generation Affymetrix 
human product, the HG-U133 Plus 2.0 Array can significantly cut processing and 
analysis time for scientists in the lab, freeing up valuable resources and 
accelerating research. 

The HG-U133 Plus 2.0 Array sets a new standard for the number of genes and 
transcripts on any commercially available single array for human gene 
expression analysis, while maintaining Affymetrix' unrivaled data quality. The
HG-U133 Plus 2.0 Array uses 22 independent measures to detect the
hybridization of each transcript on the array, 1.3 million data points in all,
more than 30 times that of any other microarray technology. Using multiple,
independent measurements provides optimal sensitivity and specificity, and the
most accurate, consistent and statistically significant results possible.

"More data points produce more reliable results and ultimately, enable 
better science," said Nicholls. "Our powerful probe set strategy gives our 
customers the assurance that their array results actually reflect what's in 
their sample." 

Affymetrix is also launching an updated 11-micron version of its popular 
18-micron HG-U133A Array called the GeneChip HG-U133A 2.0 Array. The reduced 
feature size on this new design means researchers can use smaller sample 
volumes than on the previous 18-micron array without compromising performance.
This new array represents over 20,000 transcripts that can be used to explore 
human biology and disease processes. All probe sets represented on the 
original GeneChip HG-U133A Array are identically replicated on the GeneChip 
HG-U133A 2.0 Array.

More information on the design of the HG-U133 Plus 2.0 Array and the 
HG-U133A 2.0 Array may be found on the Affymetrix website at 
http://www.affymetrix.com . 

Affymetrix will be presenting further information on this and other 
products at the BioTechnica trade show in Hanover, Germany on Oct. 7-9, 2003.
The Company will also hold a press conference on Oct. 7, from 11 a.m. to 
12 p.m. at the show regarding the new Human Genome U133 Plus 2.0 Array. If you 
would like to attend this press conference, please contact Caroline Stupnicka 
at c.stupnicka@northbankcommunications.com. 

About Affymetrix: 

Affymetrix is a pioneer in creating breakthrough tools that are driving 
the genomic revolution. By applying the principles of semiconductor technology 
to the life sciences, Affymetrix develops and commercializes systems that 
enable scientists to improve the quality of life. The Company's customers 
include pharmaceutical, biotechnology, agrichemical, diagnostics and consumer 
products companies as well as academic, government and other non-profit 
research institutes. Affymetrix offers an expanding portfolio of integrated 
products and services, including its integrated GeneChip platform, to address 
growing markets focused on understanding the relationship between genes and 
human health. Additional information on Affymetrix can be found at 
http://www.affymetrix.com . 

All statements in this press release that are not historical are 
"forward-looking statements" within the meaning of Section 21E of the 
Securities Exchange Act as amended, including statements regarding Affymetrix'
"expectations," "beliefs," "hopes," "intentions," "strategies" or the like.
Such statements are subject to risks and uncertainties that could cause actual
results to differ materially for Affymetrix from those projected, including,
but not limited to risks of the Company's ability to achieve and sustain
higher levels of revenue, higher gross margins, reduced operating expenses,
uncertainties relating to technological approaches, manufacturing, product
development, market acceptance (including uncertainties relating to product 
development and market acceptance of the GeneChip HG-U133 Human Plus 2.0 Array 
and the HG-U133A 2.0), personnel retention, uncertainties related to cost and 
pricing of Affymetrix products, dependence on collaborative partners, 
uncertainties relating to sole source suppliers, uncertainties relating to FDA 
and other regulatory approvals, competition, risks relating to intellectual 
property of others and the uncertainties of patent protection and litigation. 
These and other risk factors are discussed in Affymetrix' Form 10-K for the 
year ended December 31, 2002 and other SEC reports, including its Quarterly
Reports on Form 10-Q for subsequent quarterly periods. Affymetrix expressly 
disclaims any obligation or undertaking to release publicly any updates or 
revisions to any forward-looking statements contained herein to reflect any 
change in Affymetrix' expectations with regard thereto or any change in 
events, conditions, or circumstances on which any such statements are based. 

NOTE: Affymetrix, the Affymetrix logo, and GeneChip and are registered 
trademarks owned or used by Affymetrix, Inc. 



SOURCE Affymetrix, Inc. 

Web Site: http://www.affymetrix.com 

Photo Notes: NewsCom: 

http://www.newscom.com/cgi-bin/prnh/20031002/SFTH021 AP Archive- 
http://photoarchive.ap.org PRN Photo Desk, 

photodesk@prnewswire.com 



Issuers of news releases and not PR Newswire are solely responsible for the accuracy of the 
content. 

Macroresults through Microarrays

John C. Rockett, Reproductive Toxicology Division (MD-72), National Health and Environmental Effects Research Laboratory, Office of Research
and Development, US Environmental Protection Agency, Research Triangle Park, 2525 East Highway 54, Durham, NC 27711, USA;
tel: +1 919 541 2071, fax: +1 919 541 4017, e-mail: rockett.john@epa.gov



The third enactment of Cambridge 
Healthtech Institute's Macroresults
through Microarrays meeting was held 
in Boston (MA, USA) from 29 April- 
1 May 2002. The subtheme of this year's 
meeting was 'advancing drug discov- 
ery', a widely touted application for 
array technology. 

The evolution of microarrays 
If you were asked 'Who first conceived 
of the idea of microarrays', who would 
come to mind? Mark Schena perhaps, 
first author of the seminal 1995 paper 
on cDNA arrays [1]? Maybe Pat Brown, 
Schena's then supervisor? Or perhaps 
Stephen Fodor, the primary driver 
behind Affymetrix's (http://www.affymetrix.com)
oligonucleotide-based
platform [2]. Brits might even chant the 
name of Ed Southern [3]. Well, accord- 
ing to Roger Ekins (University College 
London Medical School; http://www.ucl.ac.uk/medicine/)
all these answers
would be wrong. It was in fact Ekins 
and his colleagues who first conceived 
of and patented 'a new generation of 
ultrasensitive, miniaturised assays for 
protein and DNA-RNA measurement 
based on the use of microarrays' in the 
mid 1980s [4]. The concept and potential
of array technology was more fully
described in a later publication, in
which Ekins et al. [5] concluded that
antibody microspots of ~50 µm² could be
achieved, and that as many as 2 million
different immunoassays could, in principle,
be accommodated on a surface
area of 1 cm².
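
The two figures quoted here are consistent with each other, as a quick check shows. The sketch below reads the microspot size as an area of about 50 µm² and assumes, purely for the sake of the arithmetic, that spots tile the surface with no gaps between them.

```python
# 1 cm = 10,000 µm, so 1 cm² = 10,000 * 10,000 µm².
UM2_PER_CM2 = 10_000 ** 2

spot_area_um2 = 50  # approximate microspot area, read from the text

# Upper bound on spots per cm² if spots were packed edge to edge (an idealization).
max_spots_per_cm2 = UM2_PER_CM2 // spot_area_um2
print(max_spots_per_cm2)  # 2000000
```

Real arrays need spacing between features, so achievable densities would fall below this bound.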

Technological innovation 

In practice, it took a different biological
molecule (DNA), a different research
group, and a leap into microfabrication
technology to even begin
approaching these kinds of densities
[Affymetrix patent 6045996 talks of
one million spots cm⁻²]. Of course,
advancing technology is one of the 
driving engines behind the genomics 
juggernaut, and we are already seeing 
'4th generation' machines for fab- 
ricating DNA chips. If the company 
representatives at this meeting are to 
be believed (and their cases seemed 
strong), spotting is out, and in situ 
fabrication of oligonucleotide-based 
'iterative custom arrays' is in. Whether 
you go with Combimatrix's
(http://www.combimatrix.com) electrochemically
directed synthesis and detection
system, febit's (http://www.febit.com)
Geniom® technology, or Nimblegen's
(http://www.nimblegen.com) Maskless
Array Synthesizer technology is a
matter of personal choice. However,
each of these machines provides the 
flexibility to design variable length 
oligonucleotide probes from se- 
quences inputted by the user, and then 
perform in situ synthesis of an array. 
Each system also boasts unique advan- 
tages. For example, Combimatrix's 
biological array processor is a semi- 
conductor coated with a 3D layer 
of porous material in which DNA, 
RNA, peptides or small molecules 
can be synthesized or immobilized 
within discrete test sites, while febit's 
Geniom One® is a fully integrated
gene-expression analysis system with 
minimal user hands-on time - the 
probe sequences are programmed, the 
RNA samples inserted, and the gene 
expression data is pumped out a few 
hours later. 



Cell- and tissue-based arrays 
Array technology is in most people's 
minds firmly linked with gene-expression 
profiling. Fewer are aware that cell- and 
tissue-based arrays have been devel- 
oped, and how they can provide 
a vital extra dimension to research. In 
support of this, Barry Bochner gave an 
update on the cell-based array system 
that Biolog (http://www.biolog.com) 
has produced for simultaneously mea- 
suring the effects of one gene in the cell 
under thousands of growth conditions 
(see [6] for further details). David Walt 
(Tufts University; http://www.tufts.edu/)
is developing single live cell arrays
using optical imaging fiber (OIF)
technology. An array of microwells is
fabricated on the face of an OIF at
densities of up to 10 million wells cm⁻².
Cells are then added to the wells and
disperse at an average of one cell per
well. Physiological and genetic responses
of each cell are measured via
fluorescence produced by reporter
genes (e.g. lacZ, gfp). Assays performed
so far include yeast live or dead cell
assay, microenvironment pH and
O₂ measurements, promoter responses
using the lacZ and phoA reporter genes,
and protein-protein interactions using 
the yeast two-hybrid system. The main 
advantage of this system is that the cells 
remain alive during the assay, which 
means a real-time timecourse can be 
performed and/or the array passed 
from sample to sample. This would be 
useful in, for example, the scanning of 
a combinatorial drug library for specific 
physiological effects. 

Tissue arrays are a useful complemen- 
tary technology to DNA arrays because 
they can be used to help validate and
understand the biological and medical
significance of gene changes discov- 
ered using standard DNA arrays. For 
example, an array of tumor tissues can 
be screened for the protein (using im- 
munohistochemistry), message (using 
in situ hybridization) and copy number 
(using comparative genomic hybridiza- 
tion) of a gene of interest, to determine 
if expression of the gene (or lack 
thereof) is related in any way to sur- 
vival. They can also be used to predict 
the probability of clinical failure of lead 
compounds as a result of toxicity by 
evaluating the distribution of the drug 
targets in normal tissue. Spyro Mousses 
and his co-workers at the National 
Human Genome Research Institute 
(http://www.nhgri.nih.gov/index.html) 
have built such arrays, including a 
multi-tumor array (~5000 specimens,
and sections from 36 normal and 800 
metastatic tissues) and a normal tissue 
array (76 tissue and 332 cell types). 

The problem with proteins 
It has been said that genomics tells us 
what might happen, transcriptomics 
indicates what should happen, and pro- 
teomics shows what is happening. The 
impact of functional proteomics on 
pharmaceutical R&D is rapidly increas- 
ing, and protein arrays are being used 
increasingly in both basic and applied 
research. Their use lies not only in com- 
parative protein expression and inter- 
action profiling, but also in diagnostics 
and drug discovery. However, an in- 
creasing number of researchers have 
found that protein arrays, like their 
cousins the DNA arrays, present several 
practical obstacles relating to their pro- 
duction and use. For example, in using 
Escherichia coli to produce recombi- 
nant eukaryotic proteins from a single 
expression vector, multiple protein 
products are often produced, suggest- 
ing mixes of truncated or otherwise 
altered proteins. There is also the obvi- 
ous concern that the proteins might 
not be modified in a similar manner to
eukaryotic systems. Also, an optimal
method for depositing and binding 
proteins to the selected substrate is 
yet to be determined, as is the best 
way to ensure that they are bound in a 
correctly folded, active conformation. 

Several companies have been address- 
ing these problems. Prolinx (http:// 
www.prolinxinc.com) is one such com- 
pany, and Karin Hughes described their 
Versalinx™ chemistry for producing 
protein, peptide and small-molecule 
arrays. Versalinx™ uses solution-phase 
conjugation followed by immobiliza- 
tion, resulting in functional orientation 
of proteins and peptides on the sub- 
strate surface. It also offers the valuable 
additional benefit of exhibiting low 
non-specific binding. Sense Proteomic 
(http://www.senseproteomic.com) is 
also among those addressing these 
problems to develop robust protein 
arrays for drug discovery and clinical 
applications and has developed func- 
tional protein array formats based on 
specific disease tissues. Subtractive hy- 
bridization is used to identify genes 
with altered expression in breast tumor 
and cystic fibrosis compared to normal 
tissue. A high throughput cloning strat- 
egy (COVET™) is then used to produce 
libraries of genes that are tagged, 
cloned, expressed, purified and finally 
immobilized on glass slides. Initial vali- 
dation studies have shown that the vast 
majority of the immobilized proteins do 
indeed retain biological function. 

Stefan Schmidt and his company 
(GPC Biotech; http://www.gpcbiotech.de)
have moved past the platform development
stage and, with their focus
firmly on drug discovery, are currently 
developing kinase-profiling arrays. 
Kinases are important targets for phar- 
maceutical drug discovery and therapy, 
and GPC's aim is to simultaneously de- 
tect multiple kinases, obtain activity pro- 
files for different cell types, or analyze 
the ability of drug candidates to inhibit 
kinase activity. To do this, recombinant 
kinase substrates are immobilized on
membranes, incubated with purified
kinase, and the substrates measured for
the degree of phosphorylation.

Summary 

Meetings like this, packed with exciting 
discoveries and intriguing and interest- 
ing innovation, heavily emphasize the 
pace at which biotechnology is advanc- 
ing, to the extent that the number of 
options for genomic and proteomic re- 
searchers can become overwhelming. 
Although data analysis is perhaps the 
greatest current concern for array users, 
an increasing challenge will be to deter- 
mine the approaches and technology 
that really work, and to do it in a timely 
manner.

A two-dimensional gel database of rat liver proteins
useful in gene regulation and drug effects studies

A standard two-dimensional (2-D) protein map of Fischer 344 rat liver 
(F344MST3) is presented, with a tabular listing of more than 1200 protein species. 
Sodium dodecyl sulfate (SDS) molecular mass and isoelectric point have been es- 
tablished, based on positions of numerous internal standards. This map has been 
used to connect and compare hundreds of 2-D gels of rat liver samples from a va- 
riety of studies, and forms the nucleus of an expanding database describing rat 
liver proteins and their regulation by various drugs and toxic agents. An example 
of such a study, involving regulation of cholesterol synthesis by
cholesterol-lowering drugs and a high-cholesterol diet, is presented. Since the
map has been obtained with a widely used and highly reproducible 2-D gel system
(the Iso-Dalt® system), it can be directly related to an expanding body of work
in other laboratories.

1 Introduction

High-resolution two-dimensional electrophoresis of proteins,
introduced in 1975 by O'Farrell and others [1-4], has
been used over the ensuing 1* years to examine a wide
variety of biological systems, the results appearing in more
than 5000 published papers. With the advent of computerized
systems for analyzing two-dimensional (2-D) gel images
and constructing spot databases, it is also possible to
plan and assemble integrated bodies of information
describing the appearance and regulation of thousands of
protein gene products [5, 6]. Creating such databases involves
amassing and organizing quantitative data from thousands
of 2-D gels, and requires a substantial commitment in
technology and resources.

Given the long-term effort required to develop a protein
database, the choice of a biological system takes on
considerable importance. While in vitro systems are ideal for
answering many experimental questions, especially in cancer
research and genetics, our experience with cell cultures and
tissue samples suggests that some in vivo approaches could
have major advantages. In particular, we have noticed that
liver tissue samples from rats and mice appear to show
greater quantitative reproducibility (in terms of individual
protein expression) than replicate cell cultures. This is perhaps
a natural result of the homeostasis maintained in a complete
animal vs. the well-known variability of cell cultures,
the latter due principally to differences in reagents (e.g.,
fetal bovine serum), conditions (e.g., pH) and genetic
"evolution" of cell lines while in culture. It is also more difficult
to generate adequate amounts of protein from cell culture
systems (particularly with attached cells), forcing the
investigator to resort to radioisotope-based or silver-based
stain-detection methods. While these methods are more
sensitive (sometimes much more sensitive) than the Coomassie
Brilliant Blue (CBB) stain typically used for protein
detection in "large" protein samples, they are generally more
variable, more labor-intensive and, in the case of radiographic
methods, may generate highly "noisy" images, due to the
properties of the films used. By contrast, large protein
samples can easily be prepared from liver using urea/Nonidet
P-40 (NP-40) solubilization and stained with CBB, which
has the advantage of being easily reproducible [8]. Finally,
there remains the question of the "truthfulness" of many in
vitro systems as compared to their in vivo analogs; how
great are the changes caused by the introduction into a
culture and the associated shift to strong selection for growth,
and how do these affect experimental outcomes? Hence
the apparent advantages of in vitro systems, in terms of
experimental manipulation, may be counterbalanced by
other factors relating to 2-D data quality.

There is a second important class of reasons for exploring
the use of an in vivo biological system such as the liver.
Historically, there have been two broad approaches to the
mechanistic dissection of biochemical processes in intact
cellular systems: genetics (a search for informative mutants)
and the use of chemical agents (drugs and chemical toxins).
Both approaches help us to understand complex systems
by disrupting some specific functional element and
showing us the result. With the development of techniques for
genetic manipulation and cloning, the genetic approach
can be effectively applied either in vitro or in vivo, although
the in vitro route is usually quicker. The chemical approach
can also be applied to either sort of biological system; here,
however, the bulk of consistently acquired information is
in experimental animals (rats and mice). While most
biologists know a short list of compounds having specific,
experimentally useful effects (e.g., inhibitors of protein synthesis,
ionophores, polymerase inhibitors, channel blockers,
nucleotide analogs, and compounds affecting polymerization
of cytoskeletal proteins), there is a much larger number of
interesting chemically-induced effects, most of them
characterized by toxicologists and pharmacologists in rodent
systems. Just as a thorough genetic analysis would involve
saturating a genome with mutations, it is possible to
imagine a saturating number of drugs, the analysis of whose
actions would reveal the complete biochemistry of the cell.
While organized drug discovery efforts usually target
specific desired effects, the nature of the process, with its
dependence on screening large numbers of compounds,
necessarily produces many unanticipated effects. It is
therefore reasonable to suppose that the required broad range of
compounds necessary to achieve "biochemical saturation"
may be forthcoming; in fact, it may already exist among the
hundreds of thousands of compounds that failed to qualify
as drugs.



Among organs, the liver is an obvious choice for the study 
of chemical effects because of its well-known plasticity and 
responsiveness. The brain appears to be quite plastic (e.g. 
[7]), but it is a complicated mixture of cell types requiring 
skillful dissection for most experiments. The kidney, while 
quite responsive, also presents a potentially confounding 
mixture of cell types. The liver, by contrast, is made up of 
one predominant cell type which is easy to solubilize: the 
hepatocyte, representing more than 95% of its mass. Most 
importantly, the liver performs many homeostatic func- 
tions that require rapid modulation of gene expression. It 
appears that most chemical agents tested affect gene ex- 
pression in the liver at some dosage (N. Leigh Anderson, 
unpublished observations), an interesting contrast to our 
earlier work with lymphocytes, for example, which seem to 
be much less responsive. Such results conform to the expec- 
tation that cells with a homeostatic, physiological role 
should be more plastic than cells differentiated for a pur- 
pose dependent on the action of a limited number of spe- 
cific genes. 

The liver also allows the parallels between in vitro and in
vivo systems to be examined in detail. Significant progress
has been made in the development of mouse, rat and
human hepatocyte culture systems, as well as in [...]
tissue slices. Using such an array of techniques, it is
possible to assemble a matrix of mammalian systems, with
mouse and rat in vivo on one level and mouse, rat and
human in vitro on a second level, and to compare effects
between species and between systems. This approach [...]
us to draw informed conclusions regarding the [...]
"universality" of biological responses among the mammals
and to offer some insight into the validity of in vitro
approaches for toxicological screening. We believe this
will be necessary if in vitro alternatives are to achieve [...]
usage in government-mandated safety testing of drugs,
consumer products and industrial and agricultural [...]

A number of interesting studies have been published using
2-D mapping to examine effects in the rodent liver. A
number of investigators have made use of the technique to
screen for existing genetic variants [8-11] or induced
mutations [12-14], mainly in the mouse. This work builds on the
wealth of genetic information available on the mouse and
its established position as a mammalian mutation-detection
system. While some studies of chemical effects have
been [...] in the mouse [15-17], most have used the
rat [18-23]. The examination of the cytochrome P-450 system
[...] has been carried out almost exclusively [...]

These considerations lead us to conclude that rodent liver
offers the best opportunity to systematically examine an
array of gene regulation systems, and ultimately to build a
predictive model of large-scale mammalian gene control 
The basic underlying foundation of such a project is a reli- 
able, reproducible master 2-D pattern of liver, to which on- 
going experimental results can be referred. In this paper, we 
report such a master pattern for the acidic and neutral pro- 
teins of rat liver (pattern F344MST3). In future, this master 
will be supplemented by maps of basic proteins, and analog- 
ous maps of mouse and human liver. 

2 Materials and methods 
2.1 Sample preparation 

Liver is an ideal sample material for most biochemical
studies, including 2-D analysis. A sample is taken of
approximately 0.5 g of tissue from the apical end of the left lobe of the
liver. Solubilization is effected as rapidly as practical; a
delay of 5-15 min appears to cause no major alteration in
liver protein composition if the liver pieces are kept cold
(e.g., on ice) in the interim. In the solubilization process,
the liver sample is weighed and placed in a glass homogenizer
(e.g., 15 mL Wheaton); 8 volumes of solubilizing solution*
(i.e., 4 mL per 0.5 g tissue) are added and the mixture is
homogenized using first the loose- and then the tight-fitting
glass pestle. This takes approximately 5 strokes with
each pestle and is carried out at room temperature because
urea would crystallize out in the cold. Once the liver sample
is thoroughly homogenized in the solubilizer, it is assumed
that all the proteins are denatured (by the chaotropic effect
of the urea and NP-40 detergent) and the enzymes
inactivated by the high pH (~9.5). Therefore these samples may
be kept at room temperature until they can be centrifuged
or frozen as a group (within several hours of preparation).
The samples are centrifuged for 6 × 10^6 g min (e.g., 500 000
g for 12 min using a Beckman TL-100 centrifuge). The
centrifuge rotor is maintained at just below room temperature
(e.g., 15-20 °C), but not too cold, so as to prevent the
precipitation of urea. The centrifuge of choice is a Beckman
TL-100 because of the sample tube sizes available, but any
ultracentrifuge accepting smallish tubes will suffice. When
an appropriate centrifuge is not available near the site of
sample preparation, samples can be frozen at -80 °C and
thawed prior to centrifugation and collection of supernatants.
Each supernatant is carefully removed following centrifugation
and aliquoted into at least 4 clean tubes for storage.
This is done by transferring all the supernatant to one
clean tube, mixing this gently (to assure homogeneous
composition) and then dividing it into 4 aliquots. The
aliquots are frozen immediately at -80 °C. These multiple
aliquots can provide insurance against a failed run or a freezer
breakdown.

* The solubilizing solution is composed of 2% NP-40 (Sigma), 9 M urea
(analytical grade, e.g., BDH or Bio-Rad), 0.5% dithiothreitol (DTT;
Sigma) and 2% carrier ampholytes (pH 9-11 LKB; these come as a 20%
stock solution, so 2% final concentration is achieved by making the final
solution 10% 9-11 Ampholine by volume). A large batch of solubilizer
(several hundred mL) is made and stored frozen at -80 °C in aliquots
sufficient to provide enough for one day's estimated sample preparation
requirement. The solution is never allowed to become warmer
than room temperature at any stage during preparation or thawing for
use, since heating of concentrated urea solutions can produce
contaminants that covalently modify proteins, producing artifactual
charge shifts. Once thawed, any unused solubilizer is discarded.
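
The two quantities in this protocol that are easiest to get wrong at the bench, the solubilizer volume and the centrifugation "dose" in g min, can be checked with a minimal sketch. The helper names are hypothetical, and the code assumes that one "volume" means 1 mL of solution per gram of tissue.

```python
# Bench arithmetic for the sample preparation protocol.
# Assumption (not stated in the source): 1 "volume" = 1 mL per g of tissue.


def solubilizer_volume_ml(tissue_g: float, volumes: float = 8.0) -> float:
    """Volume of solubilizing solution for a given tissue mass."""
    return tissue_g * volumes


def g_minutes(rcf_g: float, minutes: float) -> float:
    """Centrifugation dose as relative centrifugal force times time."""
    return rcf_g * minutes


def minutes_for_dose(target_g_min: float, rcf_g: float) -> float:
    """Run time needed to reach a target dose at a given force."""
    return target_g_min / rcf_g


if __name__ == "__main__":
    print(solubilizer_volume_ml(0.5))          # mL for a 0.5 g sample
    print(g_minutes(500_000, 12))              # dose for the quoted run
    print(minutes_for_dose(6e6, 250_000))      # time at half the force
```

A slower rotor can reach the same dose by running proportionally longer, which is how the last helper would be used when a TL-100 is not available.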

2.2 Two-dimensional electrophoresis

Sample proteins are resolved by 2-D electrophoresis using
the 20 × 25 cm Iso-Dalt® 2-D gel system ([26-29]; produced
by LSB and by Hoefer Scientific Instruments, San
Francisco) operating with 20 gels per batch. All first-dimensional
isoelectric focusing (IEF) gels are prepared using the
same single standardized batch of carrier ampholytes
(BDH 4-8A in the present case, selected by LSB's batch-testing
program for rat and mouse database work**). A 10
µL sample of solubilized liver protein is applied to each gel,
and the gels are run for 33 000 to 34 500 volt-hours using a
progressively increasing voltage protocol implemented by
a programmable high-voltage power supply. An Angelique®
computer-controlled gradient-casting system (produced
by LSB) is used to prepare second-dimensional sodium
dodecyl sulfate (SDS) polyacrylamide gradient slab
gels in which the top 5% of the gel is 11%T acrylamide, and
the lower 95% of the gel varies linearly from 11% to 18%T.

This system has recently been modified so as to employ a
commercially available 30.8%T acrylamide/N,N'-methylenebisacrylamide
prepared solution (thus avoiding the handling
of the solid acrylamide monomer) and three additional
stock solutions: buffer (made from Sigma pre-set
Tris), persulfate and N,N,N',N'-tetramethylethylenediamine
(TEMED). Each gel is identified by a computer-printed
filter paper label polymerized into the lower left corner
of the gel. First-dimensional IEF tube gels are loaded
directly (as extruded) onto the slab gels without equilibration,
and held in place by polyester fabric wedges (Wedgies®,
produced by LSB) to avoid the use of hot agarose.
Second-dimensional slab gels are run overnight, in groups
of 20, in cooled DALT tanks (10 °C) with buffer circulation.
All run parameters, reagent source and lot information,
and notations of deviation from expected results are entered
by the technician responsible on a detailed, multi-page
record of the experiment.

** This material (succeeding certified batches of which are available from
Hoefer Scientific Instruments) has the most linear pH gradient produced
by any ampholyte tested (except for the Pharmacia wide range,
which has an unacceptable tendency to bind high-molecular-weight
acidic proteins, causing them to streak).
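
The second-dimension gel composition described above (a flat 11%T zone over the top 5% of the gel, then a linear rise to 18%T at the bottom) can be written as a simple piecewise function. This is a sketch for illustration only; the function name and the convention of expressing depth as a fraction from 0 (top) to 1 (bottom) are assumptions, not part of the Angelique system.

```python
# Piecewise acrylamide profile of the gradient slab gel.
# depth_fraction runs from 0.0 (top of gel) to 1.0 (bottom); this
# parameterization is an assumption made for illustration.


def percent_t(depth_fraction: float,
              top: float = 11.0,
              bottom: float = 18.0,
              plateau: float = 0.05) -> float:
    """Return %T at a given fractional depth in the slab gel."""
    if not 0.0 <= depth_fraction <= 1.0:
        raise ValueError("depth_fraction must lie between 0 and 1")
    if depth_fraction <= plateau:
        return top  # constant-composition zone at the top of the gel
    # linear gradient over the remaining 95% of the gel
    span = 1.0 - plateau
    return top + (bottom - top) * (depth_fraction - plateau) / span


if __name__ == "__main__":
    for d in (0.0, 0.05, 0.525, 1.0):
        print(d, round(percent_t(d), 2))
```

Halfway through the gradient zone (fractional depth 0.525) the profile gives 14.5%T, midway between the two end compositions.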

2.3 Staining

Following SDS-electrophoresis, slab gels are stained for
protein using a colloidal Coomassie Blue G-250 procedure
in covered plastic boxes, with 10 gels (totalling approximately
1 L of gel) per box. This procedure (based on the work
of Neuhoff [30, 31]) involves fixation in 1.5 L of 50% ethanol
and 2% phosphoric acid for 2 h, three 30 min washes,
each in 2 L of cold tap water, and transfer to 1.5 L of 34%
methanol, 17% ammonium sulfate and 2% phosphoric acid
for 1 h, followed by the addition of a gram of powdered
Coomassie Blue G-250 stain. Staining requires approximately 4
days to reach equilibrium intensity, whereupon gels are
transferred to cool tap water and their surfaces rinsed to
remove any particulate stain prior to scanning. Gels may be
kept for several months in water with added sodium azide.
The water washes remove ethanol that would dissolve the
stain (and render the system noncolloidal, with high
backgrounds). The concentrated ammonium sulfate and methanol
solution is diluted by equilibration with the water volume
of the gels to automatically achieve the correct final
concentrations for colloidal staining. Practical advantages
of this staining approach can be summarized as follows: (i)
the low, flat background makes computer evaluation of
small spots (max OD < 0.02) possible, especially when
using laser densitometry; (ii) up to 1500 spots can be reliably
detected on many gels (e.g., rat liver) at loadings low
enough to preserve excellent resolution; and (iii) reproducibility
appears to be very good: at least several hundred
spots have coefficients of reproducibility less than 15%.
This value is at least as good as previous CBB methods, and
significantly better than many silver stain systems.
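
The dilution-by-equilibration step can be made concrete with a small calculation. The sketch below assumes that the roughly 1 L of gel behaves as 1 L of water containing none of the solutes after the water washes, and that volumes are additive; both are simplifications introduced here, and the function name is hypothetical.

```python
# Final stain-bath concentrations after the 1.5 L stock equilibrates
# with about 1 L of washed gel.
# Assumptions (not from the source): the gel volume acts as pure water
# and volumes are additive.


def equilibrated_percent(stock_percent: float,
                         stock_volume_l: float,
                         gel_volume_l: float) -> float:
    """Concentration after the stock mixes with the gel water."""
    return stock_percent * stock_volume_l / (stock_volume_l + gel_volume_l)


if __name__ == "__main__":
    stock = {"methanol": 34.0, "ammonium sulfate": 17.0,
             "phosphoric acid": 2.0}
    for name, pct in stock.items():
        print(name, round(equilibrated_percent(pct, 1.5, 1.0), 2))
```

Under these assumptions the bath settles at about 20.4% methanol, 10.2% ammonium sulfate and 1.2% phosphoric acid, which shows why the stock is made more concentrated than the intended working solution.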

2.4 Positional standardization 

The carbamylated rabbit muscle creatine phosphokinase 
(CPK) standards [32] are purchased from Pharmacia and 
BDH. Amino acid compositions, and numbers of residues 
present in proteins used for internal standardization, are 
taken from the Protein Identification Resource (PIR) se- 
quence database [33]. 



2.5 Computer analysis 

Stained slab gels are digitized in red light at 134 micron re- 
solution, using either a Molecular Dynamics laser scanner 
(with pixel sampling) or an Eikonix 78/99 CCD scanner. 
Raw digitized gel images are archived on high-density DAT 
tape (or equivalent storage media) and a greyscale video- 
print prepared from the raw digital image as hard-copy 
backup of the gel image. Gels are processed using the
Kepler® software system (produced by LSB), a commercially
available workstation-based software package built on
some of the principles of the earlier TYCHO system
[34-41]. Procedure PROC008 is used to yield a spotlist giving
position, shape and density information for each detected
spot. This procedure makes use of digital filtering,
mathematical morphology techniques and digital masking to
remove the background, and uses full 2-D least-squares
optimization to refine the parameters of a 2-D Gaussian shape
for each spot. Processing parameters and file locations are
stored in a relational database, while various log files
detailing operation of the automatic analysis software are
archived with the reduced data. The computed resolution and
level of Gaussian convergence of each gel are inspected 
and archived for quality control purposes. 

Experiment packages are constructed using the Kepler
experiment definition database to assemble groups of 2-D
patterns corresponding to the experimental groups (e.g.,
treated and control animals). Each 2-D pattern is matched
to the appropriate "master" 2-D pattern (pattern
F344MST3 in the case of Fischer 344 rat liver), thereby
providing linkage to the existing rodent protein 2-D
databases. The software allows experiments containing
hundreds of gels to be constructed and analyzed as a unit, with
up to 100 gels displayed on the screen at one time for
comparative purposes and multiple pages to accommodate
experiments of > 1000 gels. For each treatment, proteins
showing significant quantitative differences vs. appropriate
controls are selected using group-wise statistical parameters
(e.g., Student's t-test, Kepler® procedure STUDENT).
Proteins satisfying various quantitative criteria (such as P <
0.001 difference from appropriate controls) are represented
as highlighted spots onscreen or on computer-plotted
protein maps and stored as spot populations (i.e., logical
vectors) in a liver protein database. Quantitative data
(spot parameters, statistical or other computed values) are
stored as real-valued vectors in the database. Analysis of
coregulation is performed using a Pearson product-moment
correlation (Kepler procedure CORREL) to determine
whether groups of proteins are coordinately regulated by
any of the treatments. Such groups can be presented graphically
on a protein map, and reported together with the
statistical criteria used to assess the level of coregulation.
Multivariate statistical analysis (e.g., principal components
analysis) is performed on data exported to SAS (SAS Institute).
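
The two statistics named above can be illustrated with a short, self-contained sketch. It is not the Kepler STUDENT or CORREL procedure, whose internals are not described here; it simply computes a pooled-variance two-sample t statistic and a Pearson product-moment correlation for lists of spot abundances. The function names and sample values are hypothetical.

```python
from math import sqrt
from statistics import mean, variance

# Illustrative stand-ins for group comparison and coregulation analysis.
# These are generic textbook formulas, not the Kepler implementations.


def student_t(treated, control):
    """Pooled-variance two-sample t statistic and degrees of freedom."""
    n1, n2 = len(treated), len(control)
    df = n1 + n2 - 2
    pooled = ((n1 - 1) * variance(treated)
              + (n2 - 1) * variance(control)) / df
    t = (mean(treated) - mean(control)) / sqrt(pooled * (1 / n1 + 1 / n2))
    return t, df


def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length vectors."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


if __name__ == "__main__":
    # hypothetical abundances of one spot in treated vs. control gels
    print(student_t([4.0, 5.0, 6.0], [1.0, 2.0, 3.0]))
    # hypothetical abundances of two spots across the same four gels
    print(pearson_r([1.0, 2.0, 3.0, 4.0], [2.1, 3.9, 6.2, 8.0]))
```

Converting the t statistic to a P value requires the t distribution with the returned degrees of freedom, which is left to a statistics package.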

2.6 Graphical data output

Graphical results are prepared in GKS and translated
within Kepler® into output for any of a variety of devices.
Line-drawing output is typically prepared as PostScript and
printed on an Apple LaserWriter. Detailed maps presented
here have been generated using an ultra-high-resolution
PostScript-compatible Linotronic output device. Greyscale
graphics are reproduced from the workstation screen using
a Seikosha videoprinter. Patterns are shown in the standard
orientation, with high molecular mass at the top and acidic
proteins to the left.

2.7 Experiment LSBC04

In the study described here 12-week-old Charles River
male F344 rats were used. Diets were prepared at LSB
based on a Purina 5755M Basal Purified Diet. Lovastatin
and cholestyramine were obtained as prescription
pharmaceuticals, ground and mixed with the [...] ([...]
in the control diet). Animal work was [...] Microbiological
Associates (Bethesda, MD). Animals were acclimatized
for one week on the control diet, [...] control diets
for one week, and sacrificed on [...]. The
daily doses of lovastatin and cholestyramine in [...]
groups were 37 mg/kg/day and 5 g/kg/day, respectively,
based on the weight of the food consumed. Liver samples
were collected and prepared for 2-D [...] according
to the standard liver protocol (homogenization in 8
volumes of 9 M urea, 2% NP-40, 0.5% DTT, [...]
LKB pH 9-11 carrier ampholytes, followed by centrifugation
for 30 min at 80 000 × g) [...]
samples were frozen.

[...] and the data was analyzed using the Kepler® system. Gels
were scaled, to remove the effect of differences in [...]
loading, by setting the summed abundances of a [...]
number of matched spots equal for each gel.
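
The loading correction described in the last sentence can be sketched as follows. The data layout (a dictionary of gels, each mapping spot numbers to abundances), the function name and the choice of the mean reference total as the common target are all assumptions made for illustration; the original scaling procedure is not specified beyond the sentence above.

```python
# Scale each gel so that the summed abundance of a chosen set of
# matched reference spots is the same on every gel.
# Assumed layout: gels[gel_id][spot_id] = abundance (hypothetical).


def scale_gels(gels, reference_spots):
    """Return a rescaled copy of gels with equal reference-spot totals."""
    totals = {gel_id: sum(spots[s] for s in reference_spots)
              for gel_id, spots in gels.items()}
    # common target: mean of the per-gel reference totals (an assumption)
    target = sum(totals.values()) / len(totals)
    return {gel_id: {s: v * target / totals[gel_id]
                     for s, v in spots.items()}
            for gel_id, spots in gels.items()}


if __name__ == "__main__":
    gels = {
        "gel_A": {101: 10.0, 102: 30.0, 103: 5.0},
        "gel_B": {101: 20.0, 102: 60.0, 103: 12.0},  # loaded about 2x
    }
    scaled = scale_gels(gels, reference_spots=[101, 102])
    for gel_id, spots in scaled.items():
        print(gel_id, spots)
```

After scaling, differences that remain between gels (such as spot 103 in the example) reflect changes relative to the reference set rather than differences in the amount of protein loaded.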

3 Results and discussion 
3.1 The rat liver protein 2-D map 

F344MST3 is a standard 2-D pattern of rat liver [...]
based on the Fischer 344 strain. The pattern was [...]
from a single 2-D gel and extensively [...] an experiment
comparing it to a range of protein loads so as to
include both small spots and [...]
high-abundance spots. More than 700 rat [...]
have been matched to F344MST3 in a series of drug effects
and protein characterization experiments, and numerous
[...] (induced by specific drugs, for instance) have
been added as a result. A modified version including
additional spots present in the Sprague-Dawley outbred rat has
also been developed (data not shown). Figure 1 shows a
greyscale representation and Fig. 2 a schematic plot of the
master pattern. More than 1200 spots are included, most of
which are visible on typical gels loaded with 10 µL of
solubilized liver protein prepared by the standard method and
stained with colloidal Coomassie Blue. Master spot numbers
(MSNs) have been assigned to all proteins, and
appear in the following figures, each showing one quadrant of
the pattern. Figure 3 shows the upper left (acidic, high
molecular mass) quadrant, Fig. 4 the upper right (basic,
high molecular mass) quadrant, Fig. 5 the lower left (acidic,
low molecular mass) quadrant, and Fig. 6 the lower right
(basic, low molecular mass) quadrant. The quadrants overlap
as an aid to moving between them. The gel position (in
100 micron units), isoelectric point (relative to the CPK
internal pI standards) and SDS molecular mass (from the
calibration curve in Fig. 8) are listed for each spot (Table 1).
Because of the precision of the CPK-pI values, these
parameters can be used to relate spot locations between gel
systems more reliably than using pI measurements expressed
as pH. A major objective of current studies is the
identification of all major spots corresponding to known liver
proteins, as well as rigorous definitions of subcellular
organelle contents. Of particular interest to us is the parallel
development of identifications in the rat and mouse liver
maps, allowing detailed comparisons of gene expression
effects in the two systems. The results of these studies will be
presented systematically in a later edition of this database.

[...] we include here a useful series of 22 orienting
identifications as an aid to other users of the rat liver pattern
(Table 2).

3.2 Carbamylated charge standards, computed pIs and
molecular mass standardization

We have previously shown that the use of a system of closely
spaced internal pI markers (made by carbamylating a basic protein)
offers an accurate and workable solution to the problem of assigning
positions in the pI dimension [32]. The same system, based on 36
protein species made by carbamylating rabbit muscle CPK, has been
used here to assign pIs to most rat liver acidic and neutral
proteins. The standards were coelectrophoresed with total liver
proteins, and the standard spots added to a special version of the
master pattern F344MST3. The gel X-coordinates of all liver protein
spots lying within the CPK charge train were then transformed into
CPK pI positions by interpolation between the positions of
immediately adjacent standards (Table 1) using a Kepler vector
procedure.
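The interpolation step described above can be sketched as follows. This is only a minimal illustration in Python, assuming simple linear interpolation between the two adjacent standards; it is not the Kepler procedure itself, and the function name and the standard positions shown are hypothetical values chosen for the example.

```python
from bisect import bisect_right

# Hypothetical (gel X-coordinate, CPK charge position) pairs for a few
# carbamylated standards. Real values would come from the master pattern.
CPK_STANDARDS_HYPOTHETICAL = [(400.0, -30.0), (450.0, -29.0), (520.0, -28.0)]


def x_to_cpk_pi(x, standards):
    """Linearly interpolate a CPK pI position from a gel X-coordinate.

    Returns None when x lies outside the charge train covered by the
    standards (such spots cannot be assigned a position this way).
    """
    pts = sorted(standards)
    xs = [p[0] for p in pts]
    if not xs[0] <= x <= xs[-1]:
        return None
    i = bisect_right(xs, x)
    if i == len(xs):
        # x coincides with the last standard
        return float(pts[-1][1])
    (x0, p0), (x1, p1) = pts[i - 1], pts[i]
    return p0 + (p1 - p0) * (x - x0) / (x1 - x0)
```

A spot halfway between two adjacent standards receives a position halfway between their charge numbers, which is why the resulting CPK pI values are fractional.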

It has proven possible to compute fairly accurate pI values of many
proteins from the amino acid composition [42]. We have attempted
here to test a further elaboration of this approach, in which we
computed pIs for the CPK standards themselves, based on our
knowledge of the rabbit muscle CPK sequence and the fact that
adjacent members of the charge train typically differ by blockage of
one additional lysine residue (Table 3). We compared these values to
similar computed pIs for an additional set of carbamylated standards
made from human hemoglobin beta chains and a series of rat liver and
human plasma proteins of known position and sequence (Fig. 7,
Table 4). The result demonstrates good concordance between these
systems. Two proteins show significant deviations: liver fatty-acid
binding protein (FABP; #1 in Table 4) and protein disulphide
isomerase (#20 in the table). The FABP spot present on F344MST3 may
represent a charge-modified version of a more basic parent spot
closer to the expected pI, not resolved in the IEF/SDS gel. Of
particular importance is the fact that, by comparing computed pIs of
sequenced but unlocated proteins with the CPK pIs, we can assign a
probable gel location without making any assumptions regarding the
actual gel pH gradient. This offers a useful shortcut, given the
vagaries of pH measurement on small diameter IEF gels. We have used
this approach to compute the CPK pIs of all rat and mouse proteins in
the PIR sequence database, as an aid to protein identification (data
not shown).
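The idea of computing a pI from composition, and of modelling a carbamylation train by blocking one lysine at a time, can be illustrated with a short Python sketch. Everything specific here is an assumption made for illustration: the pKa values are generic textbook-style figures, the composition is invented (it is not the rabbit muscle CPK sequence), and the method (Henderson-Hasselbalch net charge solved by bisection) is a common approach rather than the procedure of reference [42].

```python
# Assumed pKa values for ionizable groups (illustrative only).
POSITIVE_PKA = {"K": 10.5, "R": 12.5, "H": 6.0, "Nterm": 9.0}
NEGATIVE_PKA = {"D": 3.9, "E": 4.1, "C": 8.3, "Y": 10.1, "Cterm": 3.1}

# Invented composition of ionizable groups for a hypothetical protein.
HYPOTHETICAL_COMPOSITION = {
    "K": 30, "R": 18, "H": 17, "D": 28, "E": 27,
    "C": 4, "Y": 9, "Nterm": 1, "Cterm": 1,
}


def net_charge(counts, ph):
    """Net charge at a given pH from counts of ionizable groups."""
    pos = sum(n / (1.0 + 10.0 ** (ph - POSITIVE_PKA[g]))
              for g, n in counts.items() if g in POSITIVE_PKA)
    neg = sum(n / (1.0 + 10.0 ** (NEGATIVE_PKA[g] - ph))
              for g, n in counts.items() if g in NEGATIVE_PKA)
    return pos - neg


def computed_pi(counts, lo=0.0, hi=14.0, tol=1e-4):
    """pH at which the net charge crosses zero, found by bisection."""
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if net_charge(counts, mid) > 0:
            lo = mid  # still positive: pI lies at higher pH
        else:
            hi = mid
    return (lo + hi) / 2.0


def charge_train(counts, steps):
    """Computed pIs for the parent protein and `steps` successive forms,
    each with one more lysine blocked (its positive charge removed)."""
    pis = []
    for blocked in range(steps + 1):
        modified = dict(counts)
        modified["K"] = counts["K"] - blocked
        pis.append(computed_pi(modified))
    return pis
```

Each blocked lysine removes one positive charge, so every successive member of the train has a lower computed pI than the one before it. Comparing such computed values for the standards with those of a sequenced protein places the protein on the CPK scale without reference to the measured pH gradient.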

In order to standardize SDS molecular weight (SDS-MW), we have used
a standard curve fitted to a series of identified proteins (Fig. 8).
Rather than using molecular mass per se, we have elected to use the
number of amino acids in the polypeptide chain, as perhaps a better
indication of the length of the SDS-coated rod that is sieved by the
second dimension slab. The resulting values were multiplied by the
weighted average mass of amino acids in sequenced proteins to give
predicted molecular masses. Because we use gradient slabs, we have
not constrained the fitted curve to conform to any predetermined
model; rather we tried many equations and selected the best using
the program "Tablecurve" on a PC. The equation chosen was
$y = a + bx + c/x^2$, where $y$ is the number of residues, $x$ is the
gel Y-coordinate, $a$ is 511.83, $b$ is -0.2731 and $c$ is 33183801.
The resulting fit appears to be fairly good over a broad range of
molecular mass.
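As a worked check of the calibration, the following Python sketch evaluates the curve, assuming it has the form y = a + b·x + c/x² with the coefficients quoted above. The per-residue mass is not legible in this copy; the value of 112 Da used below is an assumption, chosen because it reproduces the masses quoted in the text for spots 633 and 724 (gel Y-coordinates 814 and 395 in Table 1). The function names are hypothetical.

```python
# Coefficients of the fitted calibration curve (from the text).
A = 511.83
B = -0.2731
C = 33183801.0

# Assumed average residue mass in Da (not legible in the source copy).
ASSUMED_DA_PER_RESIDUE = 112.0


def residues_from_gel_y(gel_y):
    """Estimated number of residues for a spot at gel Y-coordinate gel_y
    (in 100 micron units), using y = a + b*x + c/x**2."""
    return A + B * gel_y + C / gel_y ** 2


def sds_mw(gel_y, da_per_residue=ASSUMED_DA_PER_RESIDUE):
    """Predicted SDS molecular mass in Da."""
    return residues_from_gel_y(gel_y) * da_per_residue
```

Under these assumptions, `sds_mw(814)` gives roughly 38 kDa and `sds_mw(395)` roughly 69 kDa, in line with the values stated for spots 633 and 724. The c/x² term dominates near the top of the gel, which is how the curve accommodates the compressed high-mass region of a gradient slab.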



3.3 An example of rat liver gene regulation: Cholesterol
metabolism

Experiment LSBC04 was designed as a small-scale test of the
regulation of cholesterol metabolism in vivo by three agents
included in the diet: lovastatin (Mevacor®, an inhibitor of HMG-CoA
reductase); cholestyramine (a bile acid sequestrant that has the
effect of removing cholesterol from the gut-liver recirculation);
and cholesterol itself. The first two agents should lower available
cholesterol and the third should raise it, allowing manipulation of
relevant gene expression control systems in both directions. Such an
experiment offers an interesting test of the 2-D mapping system
since most of the pathway enzymes are present in low abundance, many
are membrane-bound and difficult to solubilize, and the pathway
itself is complex. Approximately 1000 proteins were separated and
detected in liver homogenates. Twenty-one proteins were found to be
affected by at least one treatment, and these could be divided into
several coregulated groups.

3.3.1 MSN 413 (putative cytosolic HMG-CoA synthase)
and sets of spots regulated coordinately or inversely

One group of spots (including a spot assigned to the cytosolic
HMG-CoA synthase, MSN 413) showed the expected increase in abundance
with lovastatin or cholestyramine, the synergistic further increase
with lovastatin and cholestyramine, and a dramatic decrease with the
high cholesterol diet. Spot number 413 is the most strongly
regulated protein in the present experiment, showing a 5- to 10-fold
induction after a 1 week treatment with 0.075% lovastatin and 1%
cholestyramine in the diet (Figs. 9 and 10). Its expression follows
precisely the expectation for an enzyme whose abundance is
controlled by the cholesterol level; it is progressively increased
from the control levels by cholestyramine, lovastatin and lovastatin
plus cholestyramine, and it sinks below the threshold of detection
in animals fed the high cholesterol diet. This spot has been
tentatively identified as the cytosolic HMG-CoA synthase, based on a
reaction with an antiserum to that protein provided by Dr. Michael
Greenspan at Merck Sharp & Dohme Research Laboratories. This enzyme
lies immediately before HMG-CoA reductase in the liver cholesterol
biosynthesis pathway, and is known to be co-regulated with it. Spot
413 has an SDS molecular weight of about 54 000 and a CPK pI of
-11.4, in reasonably close agreement with a molecular weight of
57 300 and a CPK pI of -15.7 computed from the known sequence of the
hamster enzyme [43].

Using a classical product-moment correlation test (Kepler
procedure CORREL), a series of five additional spots was found to be
coregulated with 413. The level of correlation was exceedingly high
(> 95%). Two of these, 1250 and 933, are at similar molecular
weights and approximately one charge more acidic than 413 (Fig. 9),
indicating that they may be covalently modified forms of the 413
polypeptide. This suspicion is strengthened by the observation that
both spots are also stained by the antibody to cytosolic HMG-CoA
synthase. The remaining three correlated spots appear



to comprise an additional related pair ([...]253 and 100[...]) of
around 40 kDa and a single spot (1119) of around 28 kDa. Because
these two presumed proteins are present at sub[...] HMG-CoA synthase
is reported to consist of only one type of polypeptide, they are
likely to represent other, very tightly coregulated enzymes. A
second group of six spots was selected based on a regulatory pattern
close to the inverse of that for spot 413 (MSNs 34, 79, 178, 182,
204, 347; data not shown). For these proteins, the lowest level of
expression occurs with exposure to lovastatin plus cholestyramine
and the highest level upon exposure to the high-cholesterol diet.
Spots 182 and 79 are highly correlated and are about one charge
apart at the same molecular weight; they may thus be isoforms of a
single protein. The other four spots probably represent additional
enzymes or subunits.
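The product-moment (Pearson) correlation used to group spots can be sketched in Python as below. This is a generic illustration, not the Kepler CORREL procedure; the helper names are hypothetical, and reading the quoted "> 95%" as a correlation coefficient above 0.95 is an assumption.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Product-moment correlation between two equal-length series of
    spot abundances (one value per gel)."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two series of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation undefined for a constant series")
    return sxy / sqrt(sxx * syy)


def coregulated(reference, candidates, threshold=0.95):
    """Names of spots whose abundance tracks the reference spot."""
    return sorted(name for name, series in candidates.items()
                  if pearson_r(reference, series) > threshold)


def inversely_regulated(reference, candidates, threshold=0.95):
    """Names of spots whose abundance runs opposite to the reference."""
    return sorted(name for name, series in candidates.items()
                  if pearson_r(reference, series) < -threshold)
```

With one abundance value per gel for each spot, calling `coregulated` with the series for spot 413 as reference would return the spots that rise and fall with it, while `inversely_regulated` would return candidates for the second, inverse group.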

3.3.2 MSN 235 and coregulated spots

A third group of five spots, mainly comprised of mitochondrial
proteins including putative mitochondrial HMG-CoA synthase spots,
showed a modest induction by lovastatin alone, but little or no
effect with any of the other treatments (including the combination
of lovastatin and cholestyramine; Fig. 12). This result is
intriguing because lovastatin was expected to affect only the
regulation of enzymes of cholesterol synthesis, which is entirely
extra-mitochondrial. Three of the spots (235, 1[...], 144) form a
closely packed triad at approximately 30 kDa, and are likely to
represent isoforms of one protein. All three spots are stained by an
antibody to the mitochondrial form of HMG-CoA synthase obtained from
Dr. Greenspan. Subcellular fractionation indicates a mitochondrial
location. The other two spots (633 at about 38 kDa and 724 at about
69 kDa) are each present at lower abundance than the members of the



[...] proteins of the putative mitochondrial pathway [...] much more
variable in their expression [...]. Examination of all the
coregulated groups suggests that quantitative statistical techniques
can extract interesting information from large sets of reproducible
gels. The abundance of spots in the 413 coregulation group, for
example, shows an amazing level of concordance in expression among
the five individuals of the lovastatin plus cholestyramine treatment
group. This effect is not due to differences in total protein
loading, since these have been removed by scaling, and since
proteins with different regulation patterns can be demonstrated.
Such effects raise the possibility that coregulation sets may be
revealed through the use of a sufficiently large population of
control animals (i.e., without any experimental manipulation). This
approach, exploiting natural biological variation in protein
expression instead of drug effects, offers an important incentive
for the construction of a large library of control animal patterns.



4 Conclusions 

Because of the widespread use of rat liver in both basic
biochemistry and in toxicology, there is a long-term need for a
comprehensive database of liver proteins. The master pattern
presented here has proven to be an accurate representation of this
system, having been matched to more than 700 gels to date. As the
number of proteins identified and the number of compounds tested for
gene expression effects grows, we expect this database to contribute
valuable insights into gene regulation. Its practical utility in
several areas of mechanistic toxicology is already being
demonstrated.
3.3.3 An example of an anti-synergistic effect

A sixth spot (367) shows strong induction by lovastatin (two- to
threefold), and about half as much induction with lovastatin plus
cholestyramine, but without sharing the animal-animal heterogeneity
pattern of the 235-set (Fig. 13). This protein is also
mitochondrial, and represents the clearest example of an
anti-synergistic effect of lovastatin and cholestyramine. The
existence of such an effect demonstrates that lovastatin and
cholestyramine do not act exclusively through the same regulatory
pathway.

3.3.4 Complexity of the cholesterol synthesis pathway

Taken together, these results suggest that treatment with
lovastatin alone affects both cytosolic and mitochondrial [...],
while cholestyramine, on the other hand, either alone or in
combination with lovastatin, has a strong effect on the putative
cytosolic pathway, but little or no effect on the putative
mitochondrial pathway. An explanation for this difference may lie in
lovastatin's effect on levels of HMG-CoA and related precursors that
are exchanged between the cytosol and the mitochondrion, whereas
cholestyramine should affect only the cytosolic pathways directly
controlled by cholesterol and bile acid levels. It remains to be
explained why some
Figure 4. Upper right (high molecular weight, basic) quadrant (#2) of the rat liver map, showing spot numbers.
human hemoglobin β chain, filled diamonds) and seven other
Figure 9. Montage showing effects in the region of MSN:413. The
montage shows a small window into one portion of the 2-D pattern,
one row of windows for each experimental group, and one panel for
each gel in the experiment. The left-most pattern in each row is a
group-specific copy of the master pattern, followed by the patterns
for the five individual rats in the group. The highlighted protein
spots (filled circles) are spot 413 (on the right of each panel;
identified as cytosolic HMG-CoA synthase) and two modified forms of
it (1250 and 933). From the top, the rows (experimental groups) are:
high cholesterol, controls, cholestyramine, lovastatin, and
lovastatin plus cholestyramine.
Figure 10. Bargraph showing the quantitative effects of various
treatments on the abundance of MSN:413 (cytosolic HMG-CoA synthase)
in the gels of Fig. 9.
Figure 11. Bar graphs of a series of six coregulated spots
including MSN:413. In the bargraphs, the abundances of the
appropriate spot (master spot number shown at the top of the panel)
in each animal are shown. The five five-animal groups are in the
order (left to right): high cholesterol, controls, cholestyramine,
lovastatin, and lovastatin plus cholestyramine. Each bar within a
group represents one experimental animal liver (one 2-D gel). Note
the correlated expression of the 6 spots, especially in the two far
right (most strongly induced) groups.




Figure 12. Data on a second coregulated group of spots, presented
as in Fig. 11. The fourth experimental group (lovastatin) shows a
modest induction, while the fifth group (lovastatin plus
cholestyramine) does not.




Figure 13. Data on spot MSN:367, presented as in Fig. 11. This
protein shows unambiguously the anti-synergistic effect of
lovastatin and cholestyramine (fifth group) as compared to
lovastatin (fourth group). This response contrasts strongly with the
regulation pattern seen in Fig.
11
15 
17 
18 
19 
20 
71 
22 
23 
24 
25 
27 



311 
568 
812 
549 
845 
629 



29 
30 
32 
33 
34 
35 
36 
38 
39 
41 
42 
43 
44 
46 
47 



51 
52 
~53 
54 
55 
56 
57 
56 
59 
60 
61 
62 
65 
66 
67 
66 
60 
71 
72 
73 
74 
75 
76 
77 
7B 
79 
80 
61 
62 
63 
84 
65 
86 
87 
88 
89 
90 
91 
92 
S3 
94 



755 
649 
1204 
332 
787 
313 
807 
1164 
1263 
743 
768 
1216 
1145 
1037 
863 
712 
763 
304 
1165 
684 
1318 
1924 
1203 
1391 
309 
605 
621 
1113 
1820 
725 
2001 
722 
678 
1682 
1091 
1171 
1400 
1853 
1888 
735 
1263 
12S2 
779 
1064 
656 
636 
1582 
1570 
1264 
1338 
1833 
1767 
925 
534 
1811 
1412 
1471 
1662 
1596 
1817 
516 
1569 
1706 
651 
1415 
1773 
1338 
1708 



434 


«35.0 


263 


-24 J 


426 


-16.0 


268 


-25.2 
•15.3


569 


-21.6 


414 


-14.0 


298 


-17.5 


403 


-20.9 


448 


•6.7 
417
•6.0


446 
112


-17.2 


417 


-6.6 
■9.5
•1U


412 


•14.6 
•18.7


694 


-17.3 


470 
569


•9.2 
-19.6


569 


•7.3 


362 


•0 1 


566 


-6.7 


447 


•6.3 
535
522 
409 
177 
500 
630 
533 
302 
580 
565 
624 
506 
567 
297 
312 
407 
602 
296 
569 
545 
563 
556 
621 
564 
363 
565 
736 
696 
363 
681 
347 
563 
479 
301 
1371 
696 
719 
329 
710 
545 



696 



-21.8 
-10.0 
•0.9 
•18.3 
>0.0 
-18.4 
-19.8 
-2:5 
-10.3 
•9.2 
-6.2 
-0.6 
. -0.4 
•18.1 
•8.0 
-8.1 
•16.8 
•10.6 
•20.6 
•21.2 
-3.6 
•3.8 
•8.0 
-7.0 
-0.6 
-1.5 
-13.6 
-26.1 
-1.0 
-6.0 
•5.0 
-2.7 
•3.4 
-0.9 
-27.0 
-3.5 
Z7 
•20.8 
-6.0 
•1.4 
•7.0 
2J2 



63.800 
1021900 
64.800 
101.000 
55.200 
50.000 
66.300 
90.200 
67.000 
62.100 
63.800 
65.000 
66.000 
55.500 
54.000 
62.400 
49.000 
346.600 
66.000 
62.500 
52.400 
66.600 
48.900 
43.800 
50.800 
51.400 
46.800 
50.000 
74.600 
50.200 
62.300 
61.500 
50.100 
53.900 
55.000 
57.000 
170.800 
56.900 
37.300 
54.100 
89.000 
50.60C 
50.300 
47.800 
56.200 
51.500 
90.500 
65.900 
67,300 
43.900 
90.80C 
50.000 
53.100 
50.400 
52.300 
48.000 
51.800 
74.400 
51.700 
41.600 
43.600 
74.500 
44.500 
77.500 
51.800 
58.900 
89.100 
17.400 
43.600 
42.500 
81,700 
43,000 
53.200 
62.300 
43.700 



MSN 



Database of rat liver proteins 923



Y CPKpI SDSMW



95 
96 

97 
96 
99 

100 
101 
102 
100 
104 
105 
106 
107 
106 
100 
110 
111 
113 
114 
115 
116 
117 
118 
120 
121 
122 
123 
124 
125 
126 
127 
126 
129 
IX 
131 
132 
133 
134 
135 
136 
137 
138 
139 
140 
141 
142 
143 
144 
145 
146 
147 
148 
149 
150 
151 
152 
153 
154 
155 
156 
157 
158 
156 
160 
161 
162 
164 
166 
167 
166 
169 
170 
171 
172 
173 



1119 
1731 
1033 
1406 
578 
2004 
1106 
462 
665 
773 
312 
1769 
1565 
1602 
1482 
778 
1728 
1191 
1296 
682 
1146 
1548 
1060 
1530 
638 
1572 
23 
621 
1296 
872 
1000 
1229 
1422 
1776 
1930 
660 



536 
756 
566 
565 
1148 
536 
623 
455 
630 
1182 
1117 
509 
720 
807 
503 
516 
700 
660 
165 
907 
610 
840 
577 



1271 
1161 
453 
1856 
1504 
I486 
1689 
311 
1366 
1429 
615 
2006 
2006 
1070 
1347 
541 
1645 
1269 
1507 
1722 
032 
1031 
1070 
1256 
1275 
1663 
1034 
1953 
1020 
1566 
1905 
1340 
1506 
1338 
1060 
600 
476 
010 



423 
712 
1433 
1474 
862 
921 
717 
311 
632 
409 
757 
537 
1019 
862 
1389 
1063 
823 
697 
707 
756 
1417 
915 
346 
1017 
566 
518 
1106 
578 
1481 
760 
236 
911 
448 
503 
204 
664 
183 
417 
620 
527 
771 
1462 
806 
565 
181 
563 
678 
541 
376 
056 
1314 



•0.0 
•2.0 
•11.4 
•6.1 
•23.8 
>0.0 
•10.1 
•28.5 
•20.2 
•17.0 
<-35.0 
•1.5 
-3.6 
•2.4 
-4.8 
-16.9 
-2.0 
•6.9 
•7.5 
•19.6 
-QS 
-4.1 
•11.1 
-4.3 
•15.4 
•3.8 
«-35.0 
-21.9 
-7.5 
-14.7 
-12.0 
•64 
•5.6 
-1.4 
-0.1 
•20.4 
-20.2 
•7.9 
-9.3 
* -29.7 
•0.6 
-4.6 
-4.8 
•2.4 
*'35.0 
-6.7 
•5.7 
-22.1 
>0.0 
>0.0 
•10.7 
-6.9 
•25.7 
-2.8 
•7.9 
-4.5 
-2.1 
-13.5 
-11.4 
>0.0 
-6.1 
•7.8 
•2.6 
-11.4 
>0.0 
-11.6 
-3.8 
-0.2 
-7.0 
-4.6 
-7.0 
>0.0 
•16.3 
•28.7 
•13.7 



53.800 
40.700 
51.600 
51.700 
25.000 
53.700 
47.900 
61.300 
37.300 
23.800 
26.100 
56.100 
42.500 
36.30C 
49.700 
55.5a 
43.50C 
44.500 
160.80C 
34.10C 
48,700 
36.500 
50.80C 
37.40C 
65.2a 
42.00C 

15.3a 

13.00C 
36.00C 
33.5a 
42.6a 

66.1a 

37.3a 

57.0a 

40.7a 
53.8a 
29.7a 
36.0OC 

16,8a 

2B.10C 
37.7a 

43.7a 
43.2a 

40.7a 

is.ea 
33.8a 
77.9a 

29.8a 

51.6a 
55.3a 

26.5a 

so.ea 
13,7a 
4o.sa 
11 7.0a 
33.0a 

62.100 
56.6a 

91.4a 

44.400 
162.4a 
65.0a 
37.8a 
54.6a 

40.0a 
13.7a 

36.4a 

51.7a 

164.9a 

50.4a 

44.7a 
53.5a 

7i,aa 
32.1 a 
10.3a 



174 
175 
177 
178 
179 
180 
181 
182 
164 
185 
186 
187 
186 
191 
192 
193 
194 
195 
196 
197 
106 
100 

2a 
201 

202 

203 

204 

205 

206 

207 

208 

210 

211 

213 

214 

215 

216 

217 

218 

219 

220 

221 

223 

22S 

226 

227 

228 

229 

230 

232 

234 

235 

236 

237 
236 
239 
240 
241 
242 
243 
244 
245 
246 
247 
248 
249 
250 
251 
252 
253 
254 
255 
256 
257 
256 



1364 
625 
1562 
1321 
1069 
1066 
411 
804 
1860 
1997 
279 
773 
1536 
1560 
1618 
1460 
1380 
784 
1227 
667 
2006 
1711 
872 
202 
736 
786 
1224 
439 
1904 
1895 
240 

17a 
002 

1067 
1340 
1561 
1565 
1159 
031 
713 
1479 
965 
934 
1612 
821 
1586 
1065 
1577 
1456 
1440 
1602 
618 
920 
952 
1611 
1489 
501 
1820 
1357 
711 
1855 
1189 
551 
1346 
460 
1733 
1974 
806 
874 
753 
095 
1690 
004 
506 
1517 



183 
393 
553 
710 
615 
567 
295 
730 
806 
1017 
1113 
206 
807 
674 
687 
555 
266 
632 
1165 
553 
661 
674 
424 
435 
253 
820 
589 
963 
571 
687 
1418 
409 
517 
664 
666 
495 
755 
393 
572 
177 
011 
927 
716 
1045 
411 
1463 
567 
890 
406 
849 
489 
1004 
1138 
1006 
541 
720 
448 
569 
656 
1182 
621 
474 
456 
604 
448 
451 
786 
302 
553 



450 
679 
1006 
464 
820 



•6.7 
•15.7 
-3.6 
-7.2 
-10 4 
•0.5 
-32.1 
-16.2 
-0.6 
>0.0 
«-35.0 
-17.0 
-4.2 
-3.0 
•0.0 
•5.0 
•64 
-16.7 
-6.4 
-20.1 
>0.0 
-2.2 
-14.7 
<-35.0 
-16.0 
-16.7 
•8.5 
-30.9 
>0.0 
■0.3 
<-35.0 
•2.3 
•14.1 
-10.4 
-7.0 
•3.5 
•3.6 
-9.3 
-13.5 
•18.7 
-4.9 
•128 
-13.5 
-1.0 
-15.6 
-36 
•10.8 
-3.7 
-5.2 
•5.5 
-2.4 
-22.0 
•13.7 
•13.1 
•3.2 
-4.8 
-27.7 
-0.9 
-6.8 
-18.7 
•0.6 
•6.9 
-25.1 
-6.8 
-29.3 
•1.9 
>0.0 
•16.1 
-14.6 
•17.6 
-12.1 
-2.4 
-12.1 
-27.4 
-44 



1 62.0a 

60.300 
52.60 
43.0a 
46.3a 

51 .6a 

01 300 

42.0a 
* 34.5a 

29.8a 
26.3a 
90.8a 
36.4a 

44.9a 

44 200 

52.4a 
ioi.ea 

47.3a 
23.7a 
52.6a 

44.5a 
44.9a 

65.0a 
63.7a 
' 107.8a 
37,4a 

50.0a 
31.1a 
51 .3a 
44.2a 
i5.ea 
57.0a 
55.4a 

44.4a 

45.2a 

57.3a 

4o,7a 

69.3a 

51.2a 
170.5a 
33.9a 

33.3a 
42.7a 

28.ea 

66.6a 

13.6a 
51.6a 

34.8a 
57,3a 
36.5a 

57.9a 
30.3a 

25,4a 

30.2a 
53.5a 

42.5a 

62.1a 
51.4a 

45.8a 
23 800 
46.0a 
59.3a 
61 .COO 

49.1a 
62.1a 
6i.8a 
39.2a 

69.5a 
52.5a 
36.5a 

61. oa 

44.6a 

30.2a 

60.4a 

37.ea 






MSN X 

256 179$ 

260 661 

261 172S 

262 486 

263 1063 

265 1390 

266 510 

267 660 
266 430 

269 1044 

270 2019 

271 657 

272 695 

274 1292 

275 1350 

276 1670 

277 668 

278 961 

279 879 
281 1646 
262 1505 

283 1313 

284 1314 

285 1332 

286 1277 
288 1391 

1147 



Y CPKpI SDSMW



MSN 



* CPKpl SOSMW 



1361 
679 
1127 
172 
673 
437 
1036 
961 
606 
653 
422 
968 
712 
690 

1060 
536 
718 
570 

1064 
525 

1147 



290 
291 



925 
787 



292 1462 

293 531 

294 660 

295 1162 



216 

297 1377 

299 913 

300 2012 

301 702 

302 494 

303 403 

304 1843 

305 1049 

306 1606 

307 1219 
306 1627 

309 1524 

310 1769 

311 1606 

312 266 

313 1902 

314 1316 

315 1341 
318 1104 

320 1460 

321 650 

322 1454 

323 670 

324 656 

325 1521 

326 1567 

327 1368 

328 448 

330 1606 

331 1566 

332 531 

333 764 

334 1059 

335 1593 

336 1616 

338 1854 

339 1265 

340 561 

341 1497 

343 1351 

344 1613 



652 
824 
579 
511 
1476 
618 
449 
698 
609 
614 
979 
1523 
667 
178 
1280 
1006 
1565 
583 
999 
916 
755 
682 
1028 
1451 
1408 
1366 
1395 
523 
1053 
1456 
603 
1494 
626 
101 
675 
677 
409 
1291 
751 
697 
471 
1156 
407 
303 



1004 



565 
1047 
266 
549 



-1.1 
-20.4 
-2.0 
-28.0 
-10.9 
-6.3 
-27J 
-20.4 
-31.0 
-11-2 
>0.0 
-15.0 
-14.2 
-7.6 
-6.6 
-2.6 
-19.4 
-13.0 
•14.5 
-0.7 
-4.6 
-7.3 
-7.3 
-7.1 
-7.8 
-6.3 
-9.5 
-13.6 
-16.6 
-5.1 
•26.3 
-14.9 
-9.3 
<-35.0 
-6.5 
-13.9 
>0.0 
•19.0 
-28.1 
-32.6 
-0.7 
-11.1 
-3.3 
-6.5 
-3.0 
-4.4 
-1.5 
-3.3 

os.o 

-0.3 
-7.3 
-7.0 
-10.1 
-4.9 
-15.1 
-5.3 
-20.0 
-20.6 
-4.4 
-3.6 
-6.3 
-30.0 
-3J 
-3.6 
-26.3 
-16.7 
•10.9 
-3.5 
•3.2 
-0.6 
-6.0 
-23.6 
-4.7 
-6.6 
-0.9 



31.800 
17.700 
44.600 
25.600 
177.400 
45.000 
63.400 
29.000 
31.800 
46.600 
36,300 
65.200 
31.700 
42.900 



27.100 
53.700 
42.600 
51.300 
27.300 
54.600 
25,100 
37.400 
67.200 
46.100 
37.600 
50.700 
55.900 
13.900 
37.600 
62.000 
43.600 
48.700 
38.000 
31.300 
1Z400 
45.300 
168.200 
20.400 
30.100 
10.300 
49.600 
30.900 
33.700 
40.700 
34.700 
29.400 
14.700 
16.100 
17.600 
16.600 
54.900 
28.500 
14.400 
49.100 
13.300 
47.700 
420.500 
44.000 
44.700 
67.000 
20.100 
40.900 
43.700 
59.600 
24.700 
67.300 
86.600 
49.400 
30.300 
34.900 
50.300 
28.700 
102^00 
52.800 



345 1006 

346 1095 

347 625 
346 361 

349 110 

350 521 

351 912 

352 1574 

353 961 

354 706 

355 1450 

356 1374 

357 474 
356 798 
356 754 

360 1364 

361 1713 

362 1161 



363 
364 
365 
366 



914 
412 
741 
678 



367 1560 

368 963 



370 
371 



434 
639 
1567 



372 1675 

373 1351 

374 1506 

375 1823 

376 254 

377 1409 

378 621 

379 1017 

381 953 

382 856 

383 12S2 
364 1699 

385 1042 

386 1490 

387 1554 
366 1193 

389 1374 

390 1456 

391 718 

392 1799 

393 1482 
3*4 1227 

395 1530 

396 1410 

397 912 

399 1465 

400 1473 

401 1029 

403 1516 

404 1495 

405 1525 

406 723 

409 650 

410 1501 

411 936 

412 350 

413 1033 

415 737 

416 1576 

417 646 
416 1695 

419 725 

420 1289 

421 1171 



578 
640 
728 
983 
1343 
1130 
619 
530 
912 
762 
830 

1152 
897 
346 
338 

1066 
769 
859 

1156 

435 

466 

1503 

935 

520 

441 

610 

660 

762 
1059 

715 

532 

417 

583 

494 

595 

598 
674 



•1i.9 
•10 J 
•21.7 
-35.3 
<-35.0 
-26.7 
-13.6 
•3.7 
-12.9 
-18.9 
•5.3 
-6.5 
-28.7 
-16.3 
-17 J 
-6.4 
-2.1 
-9.3 
•13.8 
•32.0 
•17.9 
-14.6 
•3.9 
-12.4 
-31.0 
-21.2 
•3.6 
-0.5 
-6.8 



422 
423 
424 



599 
929 
739 



425 1490 



1518 
493 
583 
603 
404 
902 
969 
690 
732 
758 
1461 
577 
755 
256 
1063 
450 
1140 
754 
554 
1092 
252 
663 
478 
1057 
1120 
538 
425 
606 
496 
462 
770 
1041 
912 
162 
856 
625 
965 



-4.6 
-0.9 
<-35.0 
-6.1 
-21.8 
-11.7 
-13.1 
-15.0 
-8.1 
-2.3 
-11.2 
-4.7 
-4.0 
-6.9 
-6.5 
•5.2 
-18.5 
-1.1 
-4.8 
-8 4 
-4.3 
-6.0 
•13.9 
-5.0 
-4.9 
-11.5 
-44 
-4.7 
•4.3 
-19.4 
•20.8 
-4.6 
-134 
-35.9 
-11.4 
-18.0 
-3.7 
-21.0 
-2.3 
•18.3 
-7.7 
-9.1 
-22.6 
-13.6 
•17.9 
-4.7 



50.800 
46.800 
42.000 
31.100 
18.300 
25.700 
48.100 
54.300 
33,900 
40.400 
37.300 
24.900 
30.600 
77.800 
79.400 
27.900 
40,100 
36.100 
24.800 
63.700 
56.200 
13.000 
33,000 
55,200 
63,000 
46,700 
36.100 
40,400 
28.300 
42.700 
54.200 
65,900 
50,400 
57.500 
49.600 
49.400 
44.900 
105.300 
12,500 
57.500 
50.400 
49.100 
67,700 
34,300 
31.700 
44.000 
41,900 
40,600 
14,400 
50.600 
40,800 
106.400 
28.100 
61,900 
25.300 
40.600 
52.500 
27.100 
106,000 
45.500 
59,000 
28.300 
26.000 
53.700 
64.900 
48.900 
57.300 
56.600 
40.000 
28.900 
33.900 
193,700 
36.200 
47,700 
31.600 




427 
428 
429 
430 
431 
432 
434 
435 
436 
437 
436 
439 
440 
441 
443 
446 
447 
448 
449 
450 
451 
452 
453 
454 
456 
457 
459 
460 
461 
462 
463 
464 
465 
466 
468 
469 
470 
471 
472 
473 
474 
475 
476 
477 
478 
479 
460 
462 
463 
465 
486 
487 



489 

490 

491 

492 

493 

494 

495 

496 

497 

499 

500 

501 

502 

503 

504 

505 

506 

507 

506 

509 

510 



1 

810 
1565 
1259 
1253 
734 
463 
516 
1020 
1122 
1870 
435 
86 
1740 
589 
743 
801 
1050 
1245 
1576 
1818 
1084 
1945 
1652 
1403 
1394 
905 
1038 
1598 
1528 
1096 
649 
1814 
1388 
1194 
577 
1140 
1797 
1293 
618 
2009 
1205 
1035 
160 
469 
599 
1009 
1216 
616 
683 
1606 
478 
1025 
1045 
1609 
775 
692 
1100 
1760 
662 
470 
494 
980 
1414 
1234 
1246 
624 
1246 
1115 
1189 
1578 
787 
679 
1153 
1730 



704 
843 
303 
847 
562 
1426 
433 
1041 
1170 
196 
673 
1102 
847 
544 
1571 
335 
666 
926 
1296 
1516 
1021 
440 
802 



500 
718 
436 
561 
294 
863 
1137 
1125 
1072 
481 
1064 
467 
686 
524 
1133 
655 
299 
215 
788 
155 
1370 
662 
540 
235 
346 
673 
1013 

607 
1166 
301 
1209 
178 
964 
776 
247 
1256 
1436 
852 
546 
1072 
659 
792 
1134 
1407 
391 
402 
250 
552 . 
619 
1006 



-7 
-16.0 
•3.9 
-6.0 
-6.1 
•18.1 
•285 
■26.9 
11.6 
•98 
•0.5 
•31 .0 
<35.0 
-1.8 
228 
•17.8 
•16.2 
•11.1 
•8.2 
-3.7 
-0.9 
•10.3 
>0.0 
•2.8 
-6.1 
-6.3 
•14.0 
-11.3 
•3 4 
-4.3 
•10.2 
•15.2 
-0.9 
•6.3 
-6.9 
•23.9 
-9.6 
-1.1 
•7.6 
-21.9 
>0.0 
-8.7 
•11 4 
<-35.0 
•28.9 
•22.8 
•11.8 
-6.6 
-15.9 
-19.3 
3.3 
•286 
•11.5 
•11.2 
3 3 
-17.0 
-19.3 
-10.2 
-1.6 
•14.5 
•28.9 
•28 1 
•12.5 
-6.0 
-63 
-6.2 
-157 
-6.2 
•9.9 
-8.9 
-3.7 
•16.6 
■12.5 
-94 
-2.0 



36** 

26 8* 

24 J* 
147.6* 
4S.3E 
26.7BC 
36 

10.BQC 
60.1CC 
45JDC 
33.30C 
19.8QC 
12.60C 

29 ex 

63.10C 

3a act 

34.600 
56 900 
42.600 
63.S0C 
50.500 

91 4QC 

35.9CC 
2543C 
2S.B0C 
27.800 
58.700 
27.300 
60.100 
34.900 
54.800 
25.500 
46.000 
89.900 
131.300 
39 20C 
207.8X 
17.400 
45.600 
53.500 
117.400 
77.800 
44.000 
30.000 
49.300 
48.800 
23.700 
89 200 
20.100 
169.300 
31.80 
39.700 
110.7W 
21^0 
152DD 
36400 
51160 
278DD 
4570O 
3900C 
2550C 
16306 

ea7C8 

6600D 
10RCC8 



800 
[1099 
h686 
948 
481 
•1334 
868 
798 
622 
632 
M332 
603 
>H90 
479 
766 
747 
[■1170 
1502 
1726 
507 
670 
11347 
H513 
306 
11651 
1463 
It 806 
V 625 
1164 
803 
r 1259 
856 
803 
1162 
128 
t l355 
>. 585 
[1366 
r. 882 
►1125 
706 
1477 
080 
700 
1Q28 

786 

777 

860 
[1518 
H212 

780 

618 
fH42 

532 

771 
L1066 
,822 

914 
1064 
11824 
f.1392 
i 882 
£1487 

758 

687 

»0 



4*1* 



642 
L«17 
85 
1014 

t732 
511 » 484

512 1009 533 

513 1606 1034 

514 048 636 

515 4«1 SO 

516 1334 1044 



517 
51S 
519 



796 
622 
632 
521 1332 



522 
523 
524 



1021 
779 
670 
165 
830 

603 1104 



526 

527 1170 



1100 309 

479 1226 

766 1066 

747 1016 



528 1502 
530 1726 
532 
5X3 



231 
542 
620 
507 1011 
489 



870 

534 1347 1085 
1513 346 
306 
1851 
539 1463 
909 
625 



•16.0 
•10.2 
-23 
-13.2 
-28.5 
-7.1 
-14.8 
-16.3 
-15.7 
-21.5 
•7.1 
-22.6 
4.9 
-26.6 
-17.2 
-17.7 
-9.2 
-4.6 
-2.0 
-27.4 
-14.7 
-6.9 
-4.5 

654 <-35.0 
-0.7 



$40 
541 



542 1164 

543 603 

544 1259 
545 



962 
561 
289 
196 
655 
1143 



494 

405 
410 
975 



856 1526 

546 803 1071 

547 1162 274 

548 128 1321 

549 1355 1122 
5» 595 866 

552 1369 

553 992 
555 1125 
566 705 

557 1477 1030 

558 980 583 

559 700 1109 

560 1028 
562 896 

564 789 

565 777 

566 980 

567 1519 
1212 

760 
618 



570 
571 

573 1142 

574 532 

575 771 

576 1068 
822 
914 



577 
57B 

5* 1064 

560 1524 

581 1392 

582 962 
*4 1487 
*5 758 
566 
567 

588 ,888 

5 642 

5*> 1317 



621 
794 
1446 
766 
328 
611 
661 
594 
956 
771 
787 
250 
534 
734 
754 
794 
714 
783 
686 
672 
731 

667 1152 
930 523 



-5.1 
-13.9 
•21.7 
-9.2 
•16.2 
-6.0 
-15.0 
-16.2 
-9.3 
<-35.0 
-6.8 
-23.0 
-6.6 
-12.2 
-9.8 
-18.9 
-4.9 
•12.5 
•19.1 
-11.5 
-14.1 
-16.6 
-16.9 
-12.5 
-4.4 
-6.6 
-17.4 
•21.9 
-9.6 
•26.2 
-17.1 
-10.8 
-15.7 
-13.8 
-10.8 
-4.4 
-6.3 
-12.4 
-4.6 
-17.4 
-19.5 
-13.5 
-0.4 
-21.1 
73 



774 
485 

5^ — ' 519 
£ 65 1548 <-35.0 
*8 1014 614 .11.7 
176 
478 



? 732 
*< 1627 



1009 1426 



-18.1 
-3.0 
-11.8 



58.400 
54.100 
29J200 
47,100 
53.400 
28.800 
29.700 
39.600 
45.100 
189.000 
37.300 
26.600 
86.800 
22.300 
28.000 
29.800 
119.600 
53.400 
48.000 
30.000 
57.900 
27.300 
77.800 
46.000 
44.100 
31.100 
52.000 
93.100 
146.200 
45.900 
25.200 
12^00 
27.800 
96.400 
19.000 
25.900 
35.800 
57.500 
67.600 
66.900 
31.400 
29.300 
50.400 
26.400 
48.000 
36.900 
14.900 
40.200 
61.900 
48.600 
45.600 
49.700 
32.100 
40.000 
39,300 
109,200 
54.100 
41,800 
40.800 
36.900 
42,800 
39.400 
44.200 
45.000 
41.900 
24.900 
56.000 
39.900 
56.300 
55300 
11.500 
46.400 
172,300 
56,000 
15.500 



506 619 260 -21.9 

507 1176 461 -9i 

508 1465 1044 -50 
506 741 1188 -17.9 

600 907 402 -14.0 

601 687 656 -19.5 

602 712 1138 -18.7 

603 896 181 -14.1 

604 783 1461 -16.7 
606 736 223 -ie.0 

606 629 273 -21.6 

607 1064 266 -10.8 
606 883 503 -14.5 
606 2012 610 >00 
610 1255 903 * i 

612 1103 391 -10 1 

613 778 265 -16.9 

614 -824 516 -15 7 

615 1095 195 -10 3 

616 1759 478 -1 6 

617 994 372 -12.1 

618 751 374 -17 6 

619 1429 518 -5 7 

620 1050 520 -11.1 

621 923 1105 -13.7 

622 1462 622 -5 1 

623 759 225 -17 4 

624 758 1 038 -17 4 

625 1438 606 -5.5 

626 1096 1 069 -10 2 

627 942 548 -13.3 

628 809 621 -16 0 

629 899 979 -14 1 

630 1135 1321 -9.6 

631 979 615 -12 5 

632 1542 1076 -4 i 

633 1345 814 -6 9 
6J4 409 950 -32.2 

635 1165 704 -9 2 

636 774 604 -17.0 

637 1263 524 -8 0 

638 952 411 -13 1 

639 1 717 575 -2 1 

640 994 292 -12.1 

641 165 1224 <-35 0 

642 803 251 -16.2 

643 719 296 -18.5 

644 1100 294 -10.2 

645 534 1263 -26 1 

646 1153 1038 -9 4 

648 1246 204 -6.2 

649 14 1406 <-35.0 

650 1713 1049 -2 1 

651 1986 1183 >00 

652 1378 816 -6 5 

653 1442 1165 -5 5 

654 650 606 -20 8 
«5 1111 551 -100 

656 1095 861 -10.3 

657 1524 540 -4 4 

658 1777 860 -1 4 
656 391 584 -33 4 

660 977 565 -12.5 

661 658 166 -20.5 

662 732 312 -18 1 

663 1 787 567 -1.2 
» 888 268 -14 4 
665 889 775 -14.3 

715 221 -18.6 
«*7 781 227 -16 8 
€« 646 165 -21.0 
660 1116 353 -99 
670 1382 643 -6 4 
*71 547 789 -25.3 
673 984 746 -12 4 



100.500 
60.700 
28.800 
23.600 
68.000 
45.800 
25.400 
16S.200 
14.400 
125.300 
96,700 
94.000 
56.700 
48.700 
34.200 
69.600 
102.000 
55.400 
149.100 
50,000 
72.900 
72.400 
55.300 
55.200 
26.600 
47.900 
124,000 
29.000 
46.900 
27.200 
53.000 
48.000 
31.300 
19.100 
48.300 
27.600 
38.000 
32.400 
43.300 
49.000 
54.800 
66.700 
51.000 
92.000 
22.400 
108.900 
90.700 
91.400 
21.000 
29.000 
140.000 
16.200 
26.600 
23.800 
38.000 
24.400 
36.400 
52.700 
36.000 
53.600 
36.000 
50.400 
51.700 
187,500 
86.100 
51.500 
100.900 
39.800 
126.300 
122.400 
189,100 
76.300 
46.600 
39.200 
41.200 



Y CPKol SOSMW 



674 1661 448 -2.7 

675 1523 562 .4 4 

676 708 642 -18 8 

677 919 615 -13 7 

678 1085 551 -10 5 

679 600 923 -22 7 

680 1237 1004 -6 3 

681 1103 283 -10 1 
«2 1406 477 -6 1 
«3 1506 249 -3 4 
664 555 699 -24.8 

685 1167 1313 -9 2 

686 1932 - 790 0 0 

687 1545 619 -4 1 

688 1456 764 .5^ 

689 1011 953 -ivB 

690 1995 270 >O0 

691 612 886 -16 0 

692 1154 1461 .94 

693 1993 819 >0 0 

694 1628 656 .30 

695 928 254 - 136 

696 1 854 715 -06 

697 1997 345 >0 0 
696 957 563 -13 0 
699 1540 730 .4.2 

702 577 900 -23 8 

703 1610 562 .32 

705 1278 571 .7*8 

706 1 641 704 -0 7 

707 1 018 1 386 .11.7 

709 1074 1145 -10 7 

710 293 889 <-35 0 

712 720 412 -18 5 

713 1 386 841 -64 

714 1328 263 -7 1 

715 696 . 433 -19 .1 

716 701 481 -19 0 

717 1875 699 -0 5 

718 575 702 -23 9 

719 1216 204 -8 6 

721 1069 464 -10 8 

722 1272 506 -7.9 

723 956 822 -13 0 

724 763 395 -17.3 

725 720 916 -18.5 

726 1476 415 -49 

727 1846 473 -0.7 

728 510 783 -27.3 

729 1217 1126 -8 6 

730 1858 724 -0 6 

731 665 765 -20.2 

733 1321 312 -7.2 

734 719 427 -18.5 

735 1101 473 -10 2 

736 1359 569 -6.7 

738 696 220 -19.2 

739 687 409 -19.5 

740 1205 256 -6.7 

741 995 563 -12 1 

742 898 596 -14.1 

743 881 181 -14.5 

744 1 951 686 >0.0 

745 726 168 -18.3 

746 999 643 -12.0 

748 1 82 1503 <-35.0 

749 2005 649 >0 .0 

750 1448 575 -5 4 

751 792 266 -16.5 

752 469 296 -28.9 

754 664 254 -20.3 

755 1195 184 4.6 

756 1821 1113 -0.9 

757 909 246 -13.9 
760 790 133 -16.5 



62.100 
51.900 
46.700 
48.300 
52.700 
33 400 
30 300 
95.100 
50.100 
109.800 
43.500 
19.300 
39.100 
48.100 
40.300 
32.300 
100.200 
34.900 
14.400 
37.800 
45.900 
107.000 
42.700 
78.000 
. 51.800 
42.000 
34.400 
51.900 
51,200 
43.300 
16.900 
25.100 
34.800 
66.600 
36.800 
103,100 
63.900 
56.700 
43:600 
43.400 
140.400 
60.400 
56,400 
37.700 
69.100 
33,700 
66.200 
59.400 
39.400 
25.800 
42.300 
40.300 
65.900 
64 600 
59.500 
51.400 
127.600 
67.000 
106.200 
51.900 
49.500 
165.900 
44.200 
163.600 
46 600 
13.000 
46.300 
51.000 
101.900 
90.600 
107.000 
161.000 
26.300 
111.000 
264 900 



926 



MSN 



i ttmL 



Y CPKDl SOSyfW 



MSN 



Y CPKdI SOSMW 



761 
763 
764 

765 

766 

767 

766 

769 

770 

771 

773 

775 

776 

777 

776 

779 

780 

764 

785 

786 

787 

790 

791 

792 

793 

794 

796 

797 

796 

799 

800 

801 



1396 
1416 

2oao 

651 
10S2 
1966 
1330 
1970 
657 
1337 
1576 
969 
1436 
1539 
850 
700 
1052 
1413 

1364 

1622 
893 
616 
451 
777 

1536 

1461 
366 

1126 
633 

1420 

1759 
624 



733 
1065 



475 
1149 



613 
617 
974 
502 
824 
706 
456 
434 
411 

1136 
529 
885 

635 



803 
804 
805 
806 
807 
806 
809 
610 
811 
812 
813 
614 
815 
616 
817 
816 
619 
820 
821 
822 
823 
624 
825 
626 
827 
628 
630 
631 
832 
633 
634 
837 
838 
639 
640 
641 
642 
643 
644 
645 
646 
647 



1775 
573 
203 
980 
902 
625 
1851 
440 
1356 
651 
745 
2026 
1086 
629 
1376 
1771 
1045 
964 
1712 
1256 
1517 
1442 
1240 
1309 
2012 
637 
1342 
562 
1073 
481 
501 
751 
635 
1494 
1952 
1565 
571 
1325 
1727 
630 
2016 
673 



1429 
377 
1543 
607 
546 
?12 
437 
563 
279 
865 
547 
1466 
196 
404 
1039 
306 
627 
1015 
573 
249 
393 
1246 
610 
645 
313 
1177 
790 
263 
362 
279 
205 
6S4 
449 
513 
1014 
706 
1405 
756 
626 
1039 
620 
561 
746 
633 
456 
301 
1060 
1312 
649 
301 
679 
905 
1200 



-6.2 
•5.9 
>0.0 
-20.8 
-11.1 
>0.0 
-7.1 
>0.0 
•15.0 
-7.0 
-3.7 
-12.6 
•5.5 
•4.2 
-15.1 
-19.1 
-11.1 
-6.0 
-6.7 
•0.9 
-14.3 
-22.0 
-29.8 
•16.9 
-4.2 
-5.1 
-33.6 
-9.6 
-13.5 
•59 
-1.6 
-21.7 
-14.2 
-1 4 
-24.0 
<-35.0 
•12.5 
-14.1 
■21.7 
-0.7 
•30.6 
-6.8 
•15.1 
•17.8 
>0.0 
•104 
-21.6 
•6.5 
•14 
-11.2 
•12.4 
-2.2 

-4 4 

•5.5 

-6.3 

-7.4 
>0.0 
-13 4 

-7.0 
•24.5 
-10.7 
-28.5 
-27.6 
-17.6 
-21.3 
-4.7 
>0.0 
•3.6 
-24.1 
-7.2 
•2.0 
•21.5 
>0.0 
■19.9 



41.800 
27.300 
51.400 
59.300 
25.000 
59.900 
44.300 
46.500 
46J200 
31.500 
56.700 
37.600 
43.100 
61.000 
63.600 
66.600 
25.500 
54.400 
35.000 
37.100 
69.500 
35.100 
15.400 
72.000 
11.700 
38.300 
53.100 
133.700 
63.400 
49.800 
96.500 
35,600 
53.000 
14,200 
148.400 
57.400 
29.000 
87.200 
37.500 
29.900 
51.100 
109.7D0 
69.400 
21.600 
36.200 
46.500 
65.700 
24.000 
39.100 
103.100 
74.600 
96.700 
139.200 
46.000 
62.000 
55.600 
29.900 
43.100 
16.200 
40.700 
37.500 
29.000 
37.800 
50.500 
41.100 
37.200 
60.900 
69.300 
27.500 
19.400 
46.300 
89.200 
44.600 
34.200 
23,200 



«46 
649 

650 
651 
652 
655 
656 
657 
658 
659 
860 



862 

864 

865 

866 

866 

866 

670 

871 

872 

673 

674 

875 

676 

677 

878 

879 

880 

881 

863 

884 

865 

866 

887 

886 

689 

890 

891 

892 

894 

895 

896 

897 

896 

899 

900 

901 

903 

904 

905 

907 



1863 
1166 
1535 
1035 
634 
499 
1063 
887 
1446 
706 
1070 
472 
674 
1307 
645 
827 
665 
1807 
1323 
1226 
1904 
556 
1540 
1566 
1196 
1076 
1161 
647 
1756 
1543 
1432 
922 
1103 
1501 
796 
636 
951 
717 
1123 
891 
1245 
1962 
1322 
420 
662 
845 
624 
931 
799 
765 
775 



271 
523 
1024 
626 
542 
220 
194 
890 
639 
311 
1066 
347 
480 
499 



1004 
494 

402 
763 
1031 
346 
647 
756 
777 
351 
720 
1111 
757 
594 
276 
890 
689 
414 
607 
1103 
634 
759 
546 
229 
413 
234 
346 
626 
570 
428 
243 
703 
1094 
229 
520 



910 

911 

913 

914 

916 

917 

919 

920 

921 

923 

924 

925 

926 

927 

926 

929 

931 

932 

933 

934 

036 

637 



826 
681 
1544 
1606 
1237 
1442 
1260 
764 
1133 
1123 
829 
1131 
1441 
679 
1467 
1062 
1231 
1609 
610 
965 
047 
665 
1421 



624 

1303 

1544 

301 
387 
668 
749 
367 

1541 

1123 
380 
242 
316 
874 
219 

1191 
775 
616 
670 
900 
520 
462 

843 

1056 



-0.6 
-9.2 
-4.2 
•11.4 
•15.5 
-27.8 
•10.9 
-14.4 
•5.4 
•18.6 
-10.7 
-28.6 
-19.9 
-7.4 
•21.0 
•15.6 
•19.5 
-1.0 
-7.2 
-6.4 
-0.3 
•24.8 
-4.2 
-3.6 
-6.6 
•10.6 
•9.3 
-20.9 
•1.6 
-4.1 
•5.7 
-13.7 
•10.1 
•4.6 
•16.3 
-21.3 
-13.1 
-18.6 
•9.6 
-14.3 
-6.2 
>0.0 
•7.2 
-31.4 
-20.3 
-15.3 
-21.7 
-13.5 
-16.3 
-17.2 
-17.0 
•14 4 
-15.6 
-19.7 
-4.1 
-33 
-6.3 
•5.5 
-8.0 
-17.3 
-9.7 
•9.8 
•15.6 
-9.7 
•55 
-19.7 
-4.6 
-10.5 
•64 
•3.3 
-16.0 
-12.8 
•13.2 
-14.8 
-5.9 



99.500 
54.000 
29.600 
37.500 
53.400 
127.100 
150.500 
34.600 
46.900 
86.200 
28.000 
77.600 
56.800 
57,000 
34.600 
30.300 
57.400 
66.000 
39,400 
29.300 
77.700 
46.400 
40.700 
39.700 
76.800 
42.500 
26.400 
40.700 
49.700 
97.100 
34,800 
44,100 
66,400 
46.900 
26.600 
47.200 
40,600 
' 52.900 
121.200 
66.400 
117.800 
77.700 
47.700 
51.300 
64.500 
113.000 
43.400 
27.000 
121.000 
55.200 
34.800 
37.600 
19.700 
11.700 
89.100 
70.400 
44.100 
41.100 
73.700 
11.700 
25,900 
71.500 
113.200 
64.300 
35.400 
128.200 
23.500 
39.800 
38.000 
45.100 
34,400 
55.100 
60,600 
36.800 
26.400 




«39 1197 

941 1765 

042 602 
312 

044 993 

945 1300 

046 630 

W7 187 

046 1380 

949 1766 

W) 1036 

051 860 

9S2 957 

954 503 

955 1938 
957 1010 

956 768 
960 596 
«1 557 
962 667 
K3 564 

964 069 

965 671 

966 1204 

967 910 
966 609 

969 1265 

970 622 



971 
972 
974 
975 



976 
403 
279 
644 



976 1124 

977 994 

978 1612 

979 749 

960 1064 

961 1197 
963 1762 



964 


1344 


965 


1024 


967 


739 


968 


816 


990 


785 


991 


1159 


992 


1090 


993 


1030 


994 


847 


995 


902 


996 


688 


997 


1815 


996 


1205 




617 


1000 


968 


1001 


970 


1002 


1736 


1003 


643 


1006 


822 


1007 


875 


1009 


291 


1010 


1386 


1011 


459 


1012 


679 


1013 


1818 


1014 


1032 


1015 


1629 


1016 


1311 


1017 


1722 


1018 


1015 


1020 


1574 


1021 


781 


1022 


1129 


1023 


812 



1024 785 

1025 1290 



627 
685 

472 
496 
491 
269 
423 
736 
344 
665 
163 
152 
701 
547 
712 
816 
174 
419 
409 
320 
334 
1156 
256 
796 
154 
1046 
206 
232 
437 
567 
495 
961 
295 
664 
642 
1141 
642 
911 
1506 
317 
1105 
1159 
555 
361 
317 
928 
701 
811 
461 
847 
579 
504 
289 
290 
771 
478 
1184 
467 
279 

644 < 

745 

541 

661 
1128 
634 
994 

1134 

424 

743 
1219 

464 
63 

317 

446 

739 



-aj 

-1.5 
•22.7 
<-3S.0 
•121 
•7.5 
•21 6 
<-35.0 
•65 
•1.5 
-11.3 
•14.9 
•13.0 
•27.6 
>0.0 
-11.6 
-17,2 
-23.0 
•24 6 
-14 4 
24.5 
-12.8 
-20.0 
•6.7 
-13.9 
•22.3 
-7.7 
•156 
-12.6 
-326 
<-35.0 
•15.3 
-9.6 
•12.1 
-3.2 
-17.7 
-10.6 
-6.6 
-1.6 
-6.9 
-11.5 
•17.9 
-15.9 
-16.7 
-9.3 
-10 4 
•11.5 
-15.2 
•14.1 
•14 4 
-0.9 
-8.7 
-22 0 
•128 
•12 7 
•1.9 
-21.1 
-15.8 
-14 6 
•35.0 
•64 
•29 4 
•19.7 
-0.9 
•11 4 
•3.0 
7 4 
-20 
11.7 
•3.7 
16.6 
-9.7 
15.9 
16.7 
7.7 



V** 

59.6QC 
57.1* 
57.7tt 
1O0.3<X 
«5.1qc 

4 '.GCC 
76.2tfc 

4540G 

213.CC& 
43.4CD 
53.030 

<290C 
37.QQO 

,7 «,SCC 
5570C 
67.1CC 
B3.9CC 
80.50C 
24.600 

106.6CC 
38.700 

210.30C 
28.700 

138.9CC 

11S.30C 
63 4CC 
S1.6CC 

57.4a 

31.20C 
91.100 
45 400 
46.700 
25.3C0 
46.700 
33.900 
12.800 
64.700 
26.600 
24 600 
52 4CC 
74.900 
84 500 
33.300 

43.400 

38.200 

60.700 

36.600 

50.700 

56.500 

93100 

92.700 

40.000 

56.900 

23.700 

56.100 

06 400 

46 600 

41.300 

53.500 

45600 

25.600 

47.200 

30.700 

25.500 
65.000 
41.300 
22.500 

56.400 

591.3* 
64.6C0 
62.400 
41.9* 



405 
1296 
656 
1284 
966 
1547 
1381 
1525 
1128 
1226 
1761 
541 
816 
1036 
1439 
1540 
1576 
1069 
049 
426 
1563 
779 
1613 
1380 
264 
1261 
393 
1617 
1245 
1256 
705 
1161 
529 
506 
1896 
873 
1768 
836 
1663 
826 
971 
1697 
1157 
620 
1667 
2019 
1546 
1545 
61 
1054 
568 
1050 
457 
1864 
1714 
1717 
1976 
547 
1348 
1385 
1078 
975 
1202 
1022 
1906 
1512 
1114 
1464 
1046 
1122 
1722 
L» 1006 
1630 
764 
1068 



DtubtM of nt kr.tr p, 0 i«.w 
Y CPW SOSMW 



MM 



Y CPKtf SOSMW 



MSN 



1000 
1081 
1032 
1033 
IBM 
1005 
1036 
1039 
1040 
1011 
1044 
1045 
1047 
1048 
1040 
1060 
1061 
1062 
10S3 
1064 
1065 
1066 
1066 
1060 
1061 
1062 
1064 
1065 
1066 
1067 
1066 
1060 
1071 
1073 
1075 
1076 
1078 
1081 
1063 
1085 
1090 
1082 
1083 
1004 
'005 

-tooe 

1 0ft 

1101 
M02 
'103 

105 
M06 

107 

noe 
1111 
112 
115 
116 
'117 
118 
1110 
■120 
121 
122 
123 
'125 
126 
128 
133 
130 
147 
148 



405 
1296 

656 
1284 



552 



1547 
1381 
1525 
1128 
1226 
1761 
541 
616 
1036 
1439 
1540 
1576 
1080 
049 
426 
1563 
779 
1613 
1360 
264 
1261 
393 
1817 
1245 
1256 
705 
1181 
529 
508 
1698 
873 
1768 
636 
1663 
826 
971 
1697 
1157 
620 
1867 
2019 
1S46 
1545 
61 
1954 
588 
1050 
457 
1664 
1714 
1717 
1976 
547 
1348 
1365 
1078 
975 
1202 
1022 
1905 
1512 
1114 
1464 
1046 
1122 
1722 
1098 
1630 
764 
1968 



547 
226 
622 
4Q3 
551 
496 
645 
274 
262 
639 
910 
465 
407 
250 
635 
411 
1040 
616 
1385 
1002 
620 
377 
663 
746 
605 
645 
746 
792 
934 
734 
656 
696 
604 
609 
1126 
773 
661 
566 
463 
202 
794 
910 
597 



694 
538 
477 
635 
237 
1048 
667 
797 
532 
649 
546 
722 
1066 
621 
762 
616 
787 
933 
1076 
616 
1301 
677 
452 
857 
602 
692 
625 
569 
1162 
724 



-323 
•7.5 
-15.0 
-7.7 
-123 
-4.1 
-64 
-43 
-9.7 
-6.5 
-1.6 
-25.7 
-15.6 
-11.3 
-5.5 
-4.2 
-3.7 
-10.4 
-13.2 
•31.1 
•3.6 
-16.8 
-3-2 
-6 5 
«-35.0 
•6.0 
-333 
-0.9 
-8.2 
-8.1 
-18.9 
-6.0 
-26.3 
-27.4 
-0.3 
-14.7 
-1.5 
-15.4 
-0.6 
-15.7 
-12.7 
-2.3 
-9 4 
-21.9 
-0.5 
>0.0 
•4.1 
-4.1 
<-35.0 
>0.0 
-23.3 
-11.1 
-29.5 
-0.4 
-2.1 
-2.1 
>0.0 
-25.3 
-6.9 
-6.4 
-10.6 
-12.6 
-8.7 
•11.6 
-0.3 
-4.5 
-9.9 - 
-5.1 
-11.1 
•9.6 
-2.1 
-10.2 
-0.6 
-17 J 
>0.0 



V CP** SOSMW 



52.600 
36300 
53.000 
123.200 
37.700 
67.900 
52.700 
57.200 
46.500 
96.300 
103.600 
36.900 
34.000 
56.300 
67.300 
100.200 
47.100 
66,700 
28.900 
37.800 
16.900 
27.000 
48.000 
72.000 
45.500 
41.200 
49.000 
46.600 
41.200 
39.000 
33.000 
41.800 
45.800 
43.700 
49.100 
48.700 
2S.600 
39.900 
36.000 
51.600 
58.500 
142.300 
38.900 
34.000 
49.500 
34.600 
53.700 
59.100 
33.000 
116.000 
28.600 
45.200 
36.800 
54.200 
46.300 
53.100 
42.400 
28.000 
48.000 
40.400 
36.000 
39.300 
33.100 
27.600 
46.300 
19.700 
44.700 
61.700 
36.200 
38.600 
34.700 
37.500 
51.400 
23.800 
42.300 



1153 
1154 
1161 
1162 
1163 
1168 
1170 
1171 
1172 
1174 
1176 
1177 
1176 
1179 
1180 
1161 
1162 
1183 
1164 
1165 
1186 
1189 
1190 
1191 
1192 
1183 
1194 
1195 
1196 
1197 
1198 
1199 
1200 
1201 
1202 
1200 
1204 
1205 
1206- 
1209 
1210 
1211 
1212 
1214 
1215 
1216 
1217 
1216 
1219 
1220 
1221 
1222 
1223 
1224 
1225 
1226 
1227 
1228 
1229 
1230 
1231 
1232 
1233 
1234 
1235 
1236 
1237 
1238 
1239 
1240 
1241 
1242 
1243 
1244 
1245 



921 
ISM 
637 
623 
665 
564 
562 
538 
545 
1099 
1304 
1366 
1606 
1465 
1459 
1431 
1407 
1383 
1454 
1422 
1394 
1171 
1457 



1156 



265 
403 
344 
505 
572 
639 
637 
614 
637 
1095 
1719 
791 
964 
313 
306 
320 
326 
394 
402 
366 
641 
660 
914 
673 
970 
1021 
1392 
1354 
1362 
673 
614 
603 
696 
707 
475 
466 
759 
1324 
1563 
1665 
1812 
1411 
1392 
7M 
769 
740 
743 
713 
662 
663 
565 



400 
397 
397 
528 
529 
524 
514 
522 
566 
539 
702 
224 
224 
223 
223 
224 
162 
163 
182 
214 
266 
1114 
693 
1292 
1275 
1311 
1293 
1502 
1402 
1407 
1431 
1394 
1545 
668 
1021 
195 
194 
197 
197 
294 
294 
294 
329 
329 
266 
245 
372 
296 
205 
203 
205 
540 
542 
539 
623 
628 
447 
1282 
1461 
1170 
1005 
609 
617 
703 
662 
410 
407 
406 
511 
510 
509 
504 
562 



-13.7 
-33 
-21 J 
-21 J 
•20.2 
-24.4 
•25.0 
-25.9 
-25.5 
•10.2 
-7.5 
-66 
-3.3 
-4.6 
.-5.2 
-5.7 
-6.1 
-64 
•5.3 
•5.8 
-6.3 
•0.2 
•5.2 
•10.5 
<-35.0 
-32.6 
<-35.0 
-27.6 
-24.1 
-21.2 
-21.3 
-22 1 
-21.3 
-10.3 
-2.1 
-16.5 
-12.9 
<-35.0 
«*35.0 
<-3S.0 
<-35.0 
-33 2 
-32 7 
■33.7 
-21 .2 
-20 4 
-13.8 
-14.7 
-12.7 
•11.6 
-6.3 
-6.8 
-6.7 
-199 
-221 
•226 
-19.2 
-18.9 
•28.7 
-29.0 
-17.4 
-7.2 
•3.6 
-0.6 
-1.0 
•6.0 
-6.3 
•164 
•17.1 
•17.9 
•17.8 
-18.7 
•19.6 
20.3 
24.4 



24.700 
35.900 
66.400 
66.800 
68.700 
54.500 
54.500 
54.800 
55.700 
55.000 
50.200 
53.700 
43.400 
124.900 
124.900 
125.100 
125.200 
124,700 
164.400 
162.600 
164.300 
131.800 
94.200 
26.200 
34.700 
20.000 
20.600 
19.400 
20.000 
13.000 
16.300 
16.200 
15.400 
16 600 
11.600 
45.200 
29.700 
148.700 
149.800 
147.400 
146.600 
91.400 
91.200 
91.400 
61.600 
81.600 
101.800 
112.000 
72.900 
90.100 
139.500 
141.800 
139.500 
53.600 
53400 
53.600 
47.800 
47.500 
62.300 
20.400 
W 400 
24.200 
30.300 
38,200 
37.900 
43.400 

44.500 

66.900 
67.300 
67.500 
55.900 
56,000 
56.100 
56.500 
50.500 



1246 

1247 

1249 

1250 

1251 

1252 

1253 

1254 

1255 

1257 

1256 

1259 

1260 

1261 

1262 

1263 

1264 

1265 

1266 

1267 

1268 

1269 

1270 

1271 

1272 

1273 

1274 

1277 

1278 

1279 

1280 

1281 

1282 

1283 

1284 

1285 

1266 

1287 

1288 

1289 

1290 

1291 

1292 

1293 

1294 

1295 



547 
530 
516 
973 
607 
665 
899 
1311 
1300 
1938 
1806 
1727 
1629 
1555 
1468 
1413 
1340 
1263 
1162 
1110 
1055 
999 
959 
905 
857 
810 
774 
737 
702 
671 
645 
617 
595 
573 
552 
536 
515 
496 
467 
447 
427 
412 
397 
361 
365 
348 



577 
576 

572 

536 

532 

529 

766 

746 

761 

712 

718 

715 

713 

717 

717 

722 

717 

717 

720 

717 

717 

717 

715 

712 

714 

705 
711 
708 
711 
710 
710 
707 
704 
700 
695 
694 
687 
663 
669 
667 
655 
655 
652 
654 
6S3 

653 « 



•253 
-263 
-27.0 
-12.7 
-224 
-20.2 
-14.1 
-7.4 
-7.5 
0.0 
-1.0 
-2.0 
-3.0 
-4.0 
-5.0 
-6.0 
•7.0 
-6.0 
-9.0 
♦10.0 
•11.0 

•12.0 

•13.0 

•14.0 

•15.0 

•16.0 

-17.0 

•18.0 

•19.0 

-20.0 

•21.0 

-22.0 

•23.0 

•24.0 

-25.0 

-26.0 

-27.0 

-28.0 

-29.0 

-30.9 

-31.0 

-32.0 

-33.0 

-34.0 

•35.0 

•35.0 



50.800 

50.800 

51.200 

53.000 

54.200 

54 400 

40.200 

41.200 

40.400 

42.900 

42.600 

42.700 

42.800 

42.600 

42.600 

42.400 

42.600 

42.600 

42.500 

42.600 

42.600 

42.600 

42.700 

42.000 

42.800 

43.300 

42.900 

43.100 

42.900 

43.000 

43.000 

43.100 

43.300 

43.500 

43.700 

43.800 

44.200 

44.400 

45.200 

45.300 

45.900 

45.900 

46.100 

46.000 

46.100 

46 100 
3. Computed pr% of two sets of cwbaay Uied 
bemofJobtn f Hb) 



protein suo*^ ^ ^ cpK ^ ^ 



PIR 



Protein Name 

Rabbit muscle CPK KIRBCM 



rame 3.9 4.1 6.0 10.8 1ZS 7.0 



Hb-beta, human 



HBHU 



28 
28 
28 
28 
28 
28 
28 
28 
28 
28 
28 
28 
28 
28 
28 
28 
28 
28 
28 
28 
28 
28 
28 
28 
28 
28 
28 
28 
28 
28 
28 
28 
28 
28 
28 
28 



27 
27 
27 
27 
27 
27 
27 
27 
27 
27 
27 
27 
27 
27 
27 
27 
27 
27 
27 
27 
27 
27 
27 
27 
27 
27 
27 
27 
27 
27 
27 
27 
27 
27 
27 
27 



17 
17 
17 
17 
17 
17 
17 
17 
17 
17 
17 
17 
17 
17 
17 
17 
17 
17 
17 
17 
17 
17 
17 
17 
17 
17 
17 
17 
17 
17 
17 
17 
17 
17 
17 
17 



34 

33 
32 
31 
30 
29 
28 
27 
26 
25 
24 
23 
22 
21 
20 
19 
18 
17 
16 
15 
14 
13 
12 
11 
10 
9 
8 
7 
6 
5 
4 
3 
2 
1 
0 
0 



18 
18 
16 
18 
18 
18 
18 
18 
16 
16 
18 
18 
18 
18 
18 
18 
18 
18 
18 
18 
18 
18 
18 
18 
18 
16 
18 
18 
18 
18 
18 
18 
18 
18 
18 
18 



1 
1 
1 
1 
1 
1 
1 
1 
1 
1 
1 
1 
1 
1 
1 
1 
1 
1 
1 
1 
1 
1 
1 
1 
1 
1 
1 
1 
1 
1 
1 
1 
1 
1 
1 
0 



Ol 

6\84 

6.67 

6.54 

6.42 

631 

6.21 

6.12 

6.03 

5.94 

5.85 

5.76 

5.67 

5.58 

5.48 

5.39 

5.29 

5.20 

5.12 

5.04 

4.96 

4.89 

4.83 

4.77 

4.71 

4.66 

4.61 

4.56 

4.52 

4.48 

4.44 

4.40 

4.36 

4.32 

4.29 

4.25 

4.22 



ReaJ 
CPK 

O0 
-1 
-2 
♦3 
-4 
•5 
-6 
-7 
-8 
-9 
-10 
-11 
•12 
•13 
•14 
-15 
•16 
-17 
-18 
•19 
-20 
-21 
-22 
•23 
-24 
-25 
Protein Name 

Creatine phosphokinase (CPK), rabbit muscle 
Fatty acid-binding protein, rat hepatic 
b2-microglobulin, human 
Carbamoyl-phosphate synthase, rat 
Proalbumin (serum albumin precursor), rat 
Serum albumin, rat 

Superoxide dismutase (Cu-Zn, SOD), rat 

Phospholipase C, phosphoinositide-specific (?), rat 

Albumin, human 

Apo A-I lipoprotein, rat 

proApo A-I lipoprotein, human 

NADPH cytochrome P-450 reductase, rat 

Retinol binding protein, human 

Actin beta, rat 

Actin gamma, rat 

Apo A-I lipoprotein, human 

Apo A-IV lipoprotein, human 

Tubulin alpha, rat 

F1 ATPase beta, bovine 

Tubulin beta, pig 

Protein disulphide isomerase (PDI), rat hepatic 

Cytochrome b5, rat 

Apo C-II lipoprotein, human 

Amino acid pK assumed in calculation: 
High Specific Activity Chemiluminescent and 
'Multi-analyte' Immunoassays 
The sensitivities of immunoassays relying on conventional radioisotopic labels (i.e. 
radioimmunoassay (RIA) and immunoradiometric assay (IRMA)) permit the measurement of 
analyte concentrations above ca. 10^7 molecules/ml. This limitation primarily derives, in the 
case of 'competitive' or 'limited reagent' assays, from the 'manipulation' errors arising in the 
system combined with the physicochemical characteristics of the particular antibody used; 
however, in the case of 'non-competitive' systems, the specific activity of the label may play a 
more important constraining role. It is theoretically demonstrable that the development of 
assay techniques yielding detection limits significantly lower than 10^7 molecules/ml depends 
on: 

(1) the adoption of 'non-competitive' assay designs; 

(2) the use of labels of higher specific activity than radioisotopes; 

(3) highly efficient discrimination between the products of the immunological reactions 
involved. 

Chemiluminescent and fluorescent substances are capable of yielding higher specific activities 
than commonly used radioisotopes when used as direct reagent labels in this context, and both 
thus provide a basis for the development of 'ultra-sensitive', non-competitive, immunoassay 
methodologies. Enzymes catalysing chemiluminescent reactions or yielding fluorescent 
reaction products can likewise be used as labels yielding high effective specific activities and 
hence enhanced assay sensitivities. 

A particular advantage of fluorescent labels (albeit one not necessarily confined to them) lies 
in the possibility they offer of revealing immunological reactions localized in 'microspots' 
distributed on an inert solid support. This opens the way to the development of an entirely new 
generation of 'ambient analyte' microspot immunoassays permitting the simultaneous 
measurement of tens or even hundreds of different analytes in the same small sample using 
(for example) laser scanning techniques. Early experience suggests that microspot assays with 
sensitivities surpassing that of isotopically based methodologies can readily be developed. 

Keywords: Ultrasensitive immunoassay; fluorescent microspot immunoassay; confocal 
microscopy 
INTRODUCTION 

Immunoassay methods relying on radioisotopic 
labels have played a major role in medicine and 
other biologically related fields (agriculture, 
veterinary science, the food and pharmaceutical 
industries, etc.) during the past two decades. 
Their importance has derived from the exploitation 
both of the 'structural specificity' characterizing 
antibody-antigen reactions and the 'detectability' 
of isotopically-labelled reagents, the latter 
permitting observation of the binding reactions 
between exceedingly small concentrations of the 
key reactants involved. The combination of these 
features has endowed radioimmunoassay 
methods with unique specificity and sensitivity 
characteristics, and accounts for their ubiquitous 
use throughout modern medicine and biology. 
However, in the past few years, interest has 
increasingly focused on so-called 'alternative', 
non-radioisotopic, immunoassay methods: such 
techniques are based on essentially identical 
analytical principles but differ in the markers used 
to label the particular immunoreactant (antibody 
or analyte) whose distribution between bound 
and free moieties (following the basic analytical 
reaction) constitutes the assay 'response'. The 
reasons for this interest may be grouped under 
four headings: 



(1) Environmental; logistic; economic; practicality 
and convenience, etc. (i.e. 'non-scientific'). 

(2) The attainment of higher sensitivity. 

(3) The development of 'immunosensors' and 
'immunoprobes'. 

(4) The development of 'multi-analyte' assay 
systems. 

Our own reasons for developing non-isotopic 
techniques fall principally under headings (2) and 
(4), and this presentation will centre primarily on 
the concepts which underlie our immunoassay 
development strategy in these areas. 



THE ATTAINMENT OF 'ULTRA-HIGH' 
IMMUNOASSAY SENSITIVITY 

Though, as indicated above, the sensitivity of 
radioisotopically based immunoassay methods 
has constituted one of the principal foundations 
of their widespread use over the past 25 years, a 
fundamental reason for their replacement stems, 
paradoxically, from the current requirement to 
develop microanalytical techniques which are 
superior to them in this particular respect. 
Radioisotopic methods are, in practice, limited to 
the measurement of analyte concentrations above 
about 10^8-10^9 molecules/ml (i.e. approx. 0.15-1.5 
pmol/l) (Dakubu et al., 1984). However, in certain 
fields (e.g. virology, tumour detection) there is a 
particular need to detect or measure molecular 
concentrations below this level. The factors which 
determine immunoassay sensitivity have been 
extensively discussed (Ekins et al., 1968, 1970a; 
Ekins, 1978; Jackson et al., 1983; Dakubu et al., 
1984; Ekins, 1985). Nevertheless, some of the 
underlying concepts are still frequently misunderstood 
and merit brief discussion in the present 
context. 
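
The correspondence between molecular and molar concentrations quoted above can be checked with a short calculation. The following is a minimal sketch, not part of the original paper; the function name is chosen here purely for illustration, and only the Avogadro constant is taken as given.

```python
# Convert a concentration in molecules/ml to pmol/l.
# AVOGADRO is the SI-defined Avogadro constant; the function name is
# hypothetical and used only for this illustration.
AVOGADRO = 6.02214076e23  # molecules per mole


def molecules_per_ml_to_pmol_per_l(molecules_per_ml: float) -> float:
    """Return the molar concentration (pmol/l) for a given molecules/ml."""
    molecules_per_l = molecules_per_ml * 1000.0  # 1 l = 1000 ml
    mol_per_l = molecules_per_l / AVOGADRO
    return mol_per_l * 1e12  # 1 mol = 1e12 pmol


if __name__ == "__main__":
    for exponent in (7, 8, 9):
        conc = 10.0 ** exponent
        pmol = molecules_per_ml_to_pmol_per_l(conc)
        print(f"1e{exponent} molecules/ml = {pmol:.3g} pmol/l")
```

On this arithmetic, 10^8 and 10^9 molecules/ml correspond to roughly 0.17 and 1.7 pmol/l, in line with the approximate range quoted in the text.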



The concept of sensitivity 



One major source of past confusion has been 
disagreement regarding the concept of 'sensitivity' 
itself, many authors equating assay sensitivity 
with the slope of the dose-response curve (Yalow 
and Berson, 1970a, b; Berson and Yalow, 1973; 
see also Ekins et al., 1970b; Tait, 1970). It is now 
widely agreed that the notion that a steeper 
dose-response curve implies greater sensitivity is 
erroneous. The invalidity of this belief is clearly 
revealed by the fact that the relative magnitudes 
of the responses yielded by two assay systems are 
dependent on the particular variable which is 
chosen to represent the response (see Fig. 
1(a)) (Ekins, 1976). For this and other reasons it 
has long been recognized that the 'sensitivity' of 
an assay can only be satisfactorily represented by 
its lower limit of detection (Fig. 1(b)), and this 
concept is now embodied in all internationally 
agreed definitions of the term. An essentially 
identical definition is as the precision (i.e. 
standard deviation) of measurement of zero dose, 
since this quantity determines the least quantity 
distinguishable from zero and hence the assay 
detection limit. The sensitivity of an assay is thus 
represented by the 'zero-dose intercept' of the 
'precision profile' (Fig. 2(a)) when the latter is 
expressed in terms of standard deviation rather 
than of coefficient of variation (Ekins, 1983a). In 
short, the more sensitive of two assays is the one 
yielding greater precision of the zero dose 
estimate (Fig. 2(b)). 
[Figure 1. Panel (a): F/B plot and B/F plot. Panel (b): any plot.] 

Figure 1. (a) Diagrammatic representation of conventional RIA dose-response curves for systems using high (hi) and low (lo) 
antibody concentrations plotted in terms of free/bound (F/B) and bound/free (B/F) labelled antigen. Note that the use of a lower 
amount of antibody yields a dose-response curve of greater slope in the F/B plot, but of lower slope in the B/F plot. It is 
impossible to decide, on the basis of the data shown in this figure, which concentration of antibody yields the assay system of 
higher sensitivity. (b) The sensitivity of an assay is essentially represented by the minimum detectable dose, i.e. the SD of the 
dose measurement ($\mathrm{SD}_{(\mathrm{dose})}$) at zero dose. This is given by the SD of the response ($\mathrm{SD}_{(\mathrm{response})}$) divided by the dose-response 
curve slope at zero dose (i.e. $(\mathrm{SD}_{(\mathrm{response})} \times dD/dR)_0$). This quantity is unaffected by the choice of the coordinate frame used to 
plot the dose-response curve. (Note: it is common to multiply $(\mathrm{SD}_{(\mathrm{dose})})_0$ by an arbitrary factor to increase the confidence level 
attaching to the minimum detectable dose estimate, though, since no agreement exists regarding the value of this factor, this 
unnecessary step merely adds to confusion when the relative sensitivities of two assay procedures are compared.) 
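
The relation stated in Figure 1(b) can be checked numerically. The sketch below is an illustration only and is not taken from the original analysis: it assumes a hypothetical hyperbolic B/F dose-response curve (the parameters `R0` and `K` and the response SD are invented for the example) and propagates the response error into the F/B frame to first order. Under these assumptions the minimum detectable dose comes out the same in both coordinate frames, even though the two curves have very different slopes.

```python
# Minimum detectable dose = SD(response) / |slope of dose-response curve| at zero dose.
# All numerical values below are hypothetical and serve only as an illustration.


def slope_at_zero(response, h=1e-6):
    """Forward-difference estimate of dR/dD at D = 0."""
    return (response(h) - response(0.0)) / h


def minimum_detectable_dose(sd_response, slope):
    """SD of the dose estimate at zero dose, i.e. SD(response) x |dD/dR| at D = 0."""
    return sd_response / abs(slope)


R0 = 2.0  # assumed B/F ratio at zero dose
K = 5.0   # assumed dose at which B/F falls to half of R0


def bound_over_free(dose):
    """Hypothetical B/F response curve (falls as dose rises)."""
    return R0 / (1.0 + dose / K)


def free_over_bound(dose):
    """The same data expressed as F/B, the reciprocal of B/F."""
    return 1.0 / bound_over_free(dose)


sd_bf = 0.02  # assumed SD of the B/F response at zero dose
# First-order error propagation for y = 1/x gives SD(y) = SD(x) / x**2.
sd_fb = sd_bf / bound_over_free(0.0) ** 2

mdd_bf = minimum_detectable_dose(sd_bf, slope_at_zero(bound_over_free))
mdd_fb = minimum_detectable_dose(sd_fb, slope_at_zero(free_over_bound))

if __name__ == "__main__":
    print(f"slope (B/F) = {slope_at_zero(bound_over_free):.4f}, MDD = {mdd_bf:.4f}")
    print(f"slope (F/B) = {slope_at_zero(free_over_bound):.4f}, MDD = {mdd_fb:.4f}")
```

The slopes differ by a factor of four in magnitude, but the response SD changes by the same factor, so the ratio is unchanged.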



'Competitive' and 'non-competitive' ('limited 
reagent' and 'excess reagent') assays 

A second important misconception in this area is 
the notion that immunoassays relying on the use 
of labelled antibodies (e.g. immunoradiometric 
assays, IRMA) are ipso facto more sensitive than 



those which rely on the use of labelled 'analyte' 
(e.g. radioimmunoassays, RIA); furthermore the 
grounds originally advanced for the claimed 
superiority of labelled antibody methods (Miles 
and Hales, 1968) were partially based on false 
concepts of sensitivity, and thus failed to identify 
the true reasons why certain assay designs are 
Figure 2. (a) The 'precision profile' of an assay portrays the error in the dose measurement as a function of dose. The error 
may be represented, inter alia, by the absolute error ($\Delta D$; e.g. SD of D) or the relative error ($\Delta D/D$; e.g. CV of D). $(\Delta D)_0$, the 
error in the measurement of zero dose, represents the sensitivity of the assay. The working range may be defined as the range 
of dose values within which $\Delta D/D$ is less than an 'acceptable' value set by the investigator. (b) The more sensitive of the two 
assays (assay I) intercepts the $\Delta D$ axis at a lower value. However, assay II is more precise at higher values of dose, and has a 
wider working range. 
potentially capable of yielding far higher sensitivity 
than others. This issue likewise merits 
clarification. 

The purely pragmatic sub-classification of 
immunoassays into labelled antibody and labelled 
analyte methods diverts attention from a more 
fundamental divide in immunoassay methodology, 
which relates to the optimal concentration of 
antibody required in an assay system to maximize 
its sensitivity. In certain assay designs (which may 
be termed 'limited reagent' or 'competitive') the 
optimal concentration tends to zero; conversely in 
others (which may be termed 'excess reagent' or 
'non-competitive') the concentration tends to 
infinity. It should be particularly emphasized that 
the optimal antibody concentration is essentially 
governed, not only by the physicochemical 
characteristics of the antibody-analyte binding reaction, 
but also by the errors incurred in measurement of 
the assay response. Were an assay system to be 
totally error-free, no antibody concentration 
would be optimal, and the distinction between 
competitive and non-competitive methodologies 
would thus not arise. 

Though it is inappropriate in this presentation 
to discuss in detail the statistical and physico- 
chemical theory underlying this fundamental 
divergence in immunoassay design (see Ekins et 
aL, 1968, 1970a; Jackson etaL, 1983), the reason 
for it can perhaps be more readily understood if 
the basic principles of immunoassay are portraved 
in a somewhat different way from that in which 
they are usually presented. All immunoassays 
essentially depend upon measurement of the 
'fractional occupancy' by analyte of antibody 
binding sites following reaction of analyte with 
antibody (see Fig. 3(a)). Those techniques which 
implicitly rely on measurement of residual, 
unoccupied, binding sites optimally necessitate 
the use of concentrations of antibody tending to 
zero, and may be termed 'competitive', converse- 
ly those in which occupied sites are directly 
measured necessitate use of high antibody con- 
centrations and are termed 'non-competitive' 
(Fig. 3(b)). This emphasizes that the differences 
in assay design characterizing so-called competi- 
tive and non-competitive methods are essentially 
unrelated to which component (if any) of the 
reaction system is labelled. Indeed immunoassays 
in which no label of any kind is involved can, on 
identical grounds; be subdivided into those of 
'limited reagent 1 (or 'competitive') and 'excess 
reagent' (or 'non-competititve') design. Thus the 



distinction between these two forms of im- 
munoassay simply reflects differences in the way 
that fractional antibody occupancy is determined, 
and the fact that it is generally undesirable— for 
reasons of accuracy— to measure a small quantity 
by estimating the difference between two large 
quantities. When an immunoassay relies on the 
measurement of unoccupied antibody binding 
sites, the total amount of antibody used in the 
system must be small to minimize error in the 
resulting (indirect) estimate of occupied sites.

Figure 3. The distinction between 'non-competitive' (above)
and 'competitive' immunoassays (below) reflects how
antibody binding-site occupancy is measured. Labelled
antibody methods are 'non-competitive' if occupied sites of
the (labelled) antibody are measured, but are 'competitive'
(below right) when unoccupied sites are measured. Labelled
analyte (below left) or labelled 'anti-idiotypic' antibody
methods (below centre) rely on measurement of sites
unoccupied by analyte, and are therefore invariably of
'competitive' design.

Figure 4. Curves showing the theoretically predicted
relationship between antibody affinity and the sensitivities
achievable using 'competitive' and 'non-competitive' assay
strategies. The 'potential' sensitivity curves assume the use
of infinite specific activity labels; the sensitivities
achievable using 125I-labelled antigen or antibody are also
shown. Shaded areas indicate the sensitivity loss due to
errors in measurement of the label. Curves relating to
'competitive' assays assume a 1% error in measurement of the
response variable arising from 'experimental' errors (i.e.
errors other than those inherent in label measurement per
se). Non-competitive curves assume 'non-specific binding' of
labelled antibody of 0.01% and 1% (lower and upper curves)
respectively. Arrows indicate sensitivities claimed for
typical non-competitive immunoassay methodologies.



Conversely, when occupied sites are measured 
directly, this particular constraint does not arise; 
indeed, considerable advantage often derives 
from using relatively large amounts of antibody in 
the system. 
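
The accuracy argument above can be illustrated numerically. The sketch below is not taken from the original text: it assumes, purely for illustration, that the total and the unoccupied sites are each measured with an independent 1% relative error, and the function name is hypothetical.

```python
import math

def relative_error_of_difference(total_sites, unoccupied_sites, rel_err=0.01):
    """Relative error in occupied = total - unoccupied, assuming independent
    fractional errors of rel_err in each of the two measured quantities."""
    occupied = total_sites - unoccupied_sites
    abs_err = math.hypot(rel_err * total_sites, rel_err * unoccupied_sites)
    return abs_err / occupied

# One unit of occupied sites in both cases; only the amount of antibody differs.
large_antibody = relative_error_of_difference(100.0, 99.0)
small_antibody = relative_error_of_difference(2.0, 1.0)
print(f"100 units of antibody: {large_antibody:.0%} error in occupied sites")
print(f"  2 units of antibody: {small_antibody:.1%} error in occupied sites")
```

Under these assumed errors the indirect estimate is useless when the antibody is in large excess, which is why a design that measures unoccupied sites calls for antibody concentrations tending to zero.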



Sensitivity of 'competitive' and 
'non-competitive' immunoassays 

Competitive and non-competitive immunoassays 
differ significantly in many of their performance 
characteristics in consequence of the differences 
in optimal antibody concentration on which they 
rely. Most particularly they differ in their 
potential sensitivities. Figure 4 portrays the
sensitivities predicted theoretically as a function 
of antibody binding affinity, making realistic 
assumptions regarding the experimental errors 
incurred in reagent manipulation, 'non-specific' 
binding of labelled antibody, etc., and assuming 
the use of optimal reagent concentrations (Ekins, 
1985). Amongst other concepts illustrated in the 
figure is the much greater assay sensitivity 
potentially attainable (using an antibody of given 
affinity) by adoption of a non-competitive 
approach. In short, whereas the maximal sensitiv- 



ity realistically achievable using a competitive 
design is in the order of 10^7 molecules/ml (using
antibody of the highest affinity found in practice), 
a non-competitive method is capable of yielding 
sensitivities some orders of magnitude greater 
than this. However, Fig. 4 also demonstrates that, 
assuming the use of high affinity antibodies (i.e. 
~10^11-10^12 L/M), maximal sensitivities yielded by
isotopically based techniques (whether relying on
labelled antibody (IRMA) or labelled analyte
(RIA), or whether of competitive or
non-competitive design) are closely comparable, i.e.
of the order of 10^7-10^8 molecules/ml.

This limitation is a manifestation of the fact 
that, in the case of the non-competitive methods, 
an important constraint on assay sensitivity is 
(under certain circumstances) the 'specific activ- 
ity' of the label used. On the other hand, 
limitation of assay sensitivity due to the low 
specific activity of radioisotopic labels does not 
often arise, in practice, in the case of competitive 
assays, whose sensitivity is generally restricted by 
other factors (Ekins, 1985). The fundamental 
significance of this conclusion is that, only by the 
use of labels possessing specific activities higher 
than those of the commonly used radioisotopes in 
assays of non-competitive design, can current
sensitivity limits be breached. Conversely, use of
a higher specific activity label in a competitive 
assay will usually have no significant effect on its 
sensitivity (assuming experimental errors incur- 
red in reagent manipulation of the magnitude 
generally encountered in practice). 



High specific activity non-isotopic labels 

The term 'specific activity' is conventionally 
applied, in the case of radioisotopic labels, to 
denote the number of radioactive disintegrations 
per unit time per unit weight of the isotope or 
labelled compound. In the present context, use of 
the term is widened to signify 'detectable events' 
per unit time per unit weight of labelled material.
Thus it can be used to indicate the rate of photon
emission by a chemiluminescent or fluorescent
label, or the rate of conversion of substrate
molecules (by an enzyme label) to molecules of
a detectable product. The importance of the
concept derives from the fact that 'signal
measurement error' (i.e. error in the
measurement of the label per se) is a contributory factor in
limiting assay sensitivity, and may (when other
sensitivity-constraining factors are reduced)
become dominant. Furthermore, when extending
the sensitivities of immunoassay systems beyond
their present limits, the numbers of molecules
involved are low, and statistical errors incurred in
counting individual 'detectable events', and the
time required to count them, may assume a
particular importance.

Table 1 compares the specific activities of
potentially useful labels with that of 125I. All are
of relevance in the context of this volume since
chemiluminescent and fluorescent labels can be
used to label antibodies (or antigens) directly;
alternatively, enzyme labels catalysing reactions
yielding chemiluminescent signals or fluorescent 
products can be utilized. 



Table 1. Relative specific activities of various
isotopic and non-isotopic labels. Note that, though
the specific activity of 125I-labelled reagents does
not, in practice, significantly limit the sensitivity of
competitive assays (see Fig. 4), the lower specific
activity of 3H may severely restrict the sensitivity
of competitive assays (e.g. of steroid hormones)
which rely on the use of this particular radioisotope.

Label                     Specific Activities

125I:                     1 detectable event/sec/7.5 x 10^6
                          labelled molecules.
3H:                       1 detectable event/sec/5.6 x 10^8
                          labelled molecules.
Enzymes:                  Determined by enzyme 'amplification
                          factor' and detectability of
                          reaction product.
Chemiluminescent labels:  1 detectable event/labelled molecule.
Fluorescent labels:       Many detectable events/labelled
                          molecule.
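
The two radioisotope entries in Table 1 follow from the radioactive decay law, and the consequence for counting time can be estimated from Poisson statistics. The sketch below is an illustration only: the half-lives (about 60 days for 125I and about 12.3 years for 3H) are assumed round values not stated in the text, every disintegration is assumed to be detected, and the function names are hypothetical.

```python
import math

SECONDS_PER_DAY = 86_400.0
SECONDS_PER_YEAR = 365.25 * SECONDS_PER_DAY

def molecules_per_event_per_second(half_life_seconds):
    """Number of labelled molecules that together give one disintegration/s."""
    decay_constant = math.log(2) / half_life_seconds  # events/s per molecule
    return 1.0 / decay_constant

def counting_time_seconds(n_molecules, molecules_per_event, rel_precision):
    """Time needed to collect enough counts for a given Poisson relative
    precision (relative s.d. = 1/sqrt(counts)), assuming 100% detection."""
    counts_needed = 1.0 / rel_precision ** 2
    event_rate = n_molecules / molecules_per_event
    return counts_needed / event_rate

i125 = molecules_per_event_per_second(60.0 * SECONDS_PER_DAY)   # assumed half-life
h3 = molecules_per_event_per_second(12.3 * SECONDS_PER_YEAR)    # assumed half-life
print(f"125I: 1 event/s per {i125:.2e} molecules")
print(f"3H:   1 event/s per {h3:.2e} molecules")
print(f"Time to count 1e6 125I labels to 10%: "
      f"{counting_time_seconds(1e6, i125, 0.1):.0f} s")
```

With these assumed half-lives the results agree with the tabulated values to within rounding, and they show why counting a small number of radiolabelled molecules to useful precision takes minutes rather than seconds.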



The importance of background in 
non-competitive immunoassays 

A second important factor governing the sensitiv- 
ity of non-competitive labelled-antibody im- 
munoassays is the 'background' or 'blank' signal 
emitted in the absence of analyte, since error in
the measurement of this signal is clearly a major
determinant of the error in measurement of zero
dose. Amongst contributors to the background
are the 'noise' of the measuring instrument
itself, ambient signal generators (such as, in
sandwich immunoassays, solid 'capture-antibody'
supports or, in the case of radioisotopic
methods, cosmic ray and other extraneous
radiation sources) and 'non-specifically bound'
labelled antibody. Minimization of each of these
components is essential for maximal sensitivity;
mere arithmetic subtraction of background is of
absolutely no benefit in this context.

Non-specific binding of antibody is of particular 
interest since the magnitude of this contribution 
is dependent, inter alia, on the amount of labelled 
antibody used in the system, and the duration of 
its exposure to analyte. Thus increasing the
amount of labelled antibody increases the amount 
of such antibody bound to analyte; however it 
may also increase the non-specifically bound 
moiety to a greater proportional extent, and thus 
cause a net reduction in sensitivity. This effect 
underlies the loss in sensitivity at higher antibody 
concentrations depicted in Fig. 5 (reproduced 
from Jackson et al., 1983). This phenomenon also 
underlies the relationship between sensitivity and 
the affinity constant of the labelled antibody 
depicted in Fig. 4. The possession by labelled
antibody of a high affinity constant implies that a
lower concentration is required to yield the same
level of analyte binding, albeit with reduced
non-specific binding, thus increasing assay
sensitivity.

Figure 5. Assay sensitivity (represented by the standard
deviation of the zero dose measurement, σ0), plotted as a
function of the concentration of labelled antibody (of affinity
10^11 L/M) used in the assay, assuming different levels of
non-specific binding of labelled antibody. (Note: an irreducible
instrument background has been assumed in the computations
represented; this limits the ultimate sensitivity attainable,
regardless of the concentration of antibody used.)

In summary, the high sensitivity of non- 
competitive labelled antibody methods derives 
essentially from their permitted use of optimal 
concentrations of antibody which (provided non- 
specific binding of labelled antibody is low) 
are generally considerably greater than in com- 
petitive methods, not from the fact that the 
antibody is labelled. Labelled antibody methods 
generally fall in sensitivity as the concentration of 
antibody is reduced towards zero, ultimately 
yielding a sensitivity theoretically identical to that 
of competitive methods (Rodbard and Weiss, 
1973). (Paradoxically, early exponents of labelled 
antibody methods, whilst claiming them to be of 
higher sensitivity, also concluded that their 
sensitivity was increased by reduction in the 
amount of labelled antibody used (Woodhead et
al., 1971). This incorrect conclusion, based on
observation of effects on the slope of the
dose-response curve, exemplifies the many
fallacies encountered in the immunoassay field
stemming from confusion regarding the concept of
sensitivity discussed above.) Finally it should be 



emphasized that maximization of the sensitivity of 
a non-competitive immunoassay generally implies 
the selection of reagent concentrations and other 
experimental conditions such that the [analyte 
signal/background] ratio (i.e. s/b) is maximized.
However, this simple relationship disregards
statistical considerations which arise when the
numbers of detectable events are very low, and a
more appropriate objective may, under these
circumstances, be maximization of the ratio s^2/b
(Loevinger and Berman, 1951). 
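
Why s^2/b, rather than s/b, matters when few events are counted can be sketched as follows. This is an illustration under stated assumptions, not the authors' calculation: Poisson counting statistics, equal counting time on sample and blank, and hypothetical count rates.

```python
def counting_time(signal_rate, background_rate, rel_precision):
    """Counting time (s) needed to measure the net signal to rel_precision.

    With equal time t on sample and blank, var(net rate) = (s + b)/t + b/t,
    so the relative s.d. of the net signal is sqrt((s + 2b)/t) / s.
    """
    return (signal_rate + 2.0 * background_rate) / (rel_precision * signal_rate) ** 2

# Two hypothetical set-ups with the same s/b = 0.1 but different s^2/b.
t_low = counting_time(signal_rate=1.0, background_rate=10.0, rel_precision=0.1)
t_high = counting_time(signal_rate=10.0, background_rate=100.0, rel_precision=0.1)
print(f"s^2/b = 0.1: {t_low:.0f} s;  s^2/b = 1.0: {t_high:.0f} s")
```

When b greatly exceeds s the required time is proportional to b/s^2, so two systems with identical s/b can differ tenfold in the counting time needed for the same precision.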



Other performance characteristics of 
competitive and non-competitive 
immunoassays 

Non-competitive designs also display a number of 
other advantages deriving from the relatively high 
antibody concentrations on which they generally 
rely. These include increased reaction speeds 
(and hence shorter incubation times), decreased 
vulnerability to certain environmental effects 
(which cause variations in binding affinity be- 
tween antibody and analyte), reduced sensitivity- 
dependence on high antibody binding affinity, 
etc. 

Nevertheless a price has to be paid for these 
benefits; this includes the greater tendency of a 
large amount of antibody to bind molecules 
differing from, but with structural resemblance 
to, the analyte itself, implying a loss of assay 
specificity. This effect generally necessitates the 
use, whenever possible, of an 'immunoextraction' 
procedure using a second 'capture' antibody 
(usually directed against a different binding site, 
or 'epitope') as shown in Fig. 3(b). This 
technique, the 'sandwich' or 'two-site'
immunoassay (Wide, 1971), thus potentially
combines the twin virtues of ultra-high sensitivity and
specificity (together with short reaction time), 
features of crucial importance in many diagnostic 
situations (for example, in the detection of AIDS 
viral antigens). (Note, however, that the loss of 
specificity inherent in non-competitive assay 
designs implies that they are less readily applic- 
able to the measurement of analytes of small 
molecular size, which cannot be simultaneously 
bound by two different antibodies directed 
against different antigenic sites on the molecule. 
Such analytes are generally more appropriately 
measured using 'competitive' assay methods.)



Development of ultra-sensitive
immunoassay methodologies 

The perception that the development of 'ultra- 
sensitive' immunoassay systems (i.e. systems 
surpassing conventional RIA methods in
sensitivity) depends on (a) reliance on 'excess reagent' or
'non-competitive' assay designs; (b) the use of
non-isotopic labels displaying higher specific
activities than commonly used radioisotopes; (c)
the development of efficient separation systems
(ensuring minimization of non-specific antibody
binding, and hence of signal 'backgrounds'); and
(d) dual or multi-antibody analyte-recognition
systems (exemplified by 'sandwich' or two-site
assays) to maintain/increase assay specificity, has
formed the basis of our own laboratory's
immunoassay development since the early to mid-
1970s (Ekins, 1978). This led us, inter alia, to an
immediate recognition (Ekins, 1979, 1980) of the
importance of the in vitro techniques of
monoclonal antibody production pioneered by Kohler
and Milstein (1975), which are currently the
subject of bitter patent disputes in the USA
(Ezzell, 1986, 1987a,b), and which may be
expected in Europe.

Meanwhile, of the candidate labels for use in 
this context, both chemiluminescent and fluores- 
cent labels offer many attractions. The develop- 
ment of stable, highly chemiluminescent, acridi- 
nium esters by McCapra and his colleagues 
(McCapra et al., 1977) has subsequently been 
exploited by Weeks et al. (1983, 1984) and, more
recently, by several commercial kit
manufacturers; other workers have used more conventional
chemiluminescent compounds to label
immunoassay reagents (see, for example, Kohen et al.,
1984, 1985; Barnard et al., 1985). Yet others have
relied on enzyme labels to catalyse
chemiluminogenic (Whitehead et al., 1983) and fluorogenic
(Shalev et al., 1980) reactions as indicated above.
Detailed description of these various methodolo- 
gies is presented by others in this volume and 
need not be duplicated here. 

Common to all the 'ultra-sensitive' immuno- 
assay methodologies relying on such alternative 
labels is their dependence on a non-competitive, 
labelled antibody, assay strategy whenever 
appropriate; however, for the reasons indicated 
above, competitive methods continue to be 
generally employed for the measurement of 
analytes of small molecular size (e.g. therapeutic
drugs, steroid and thyroid hormones, etc.). 



Nevertheless, the convenience (from a manufac- 
turing viewpoint, and for other technical reasons) 
of relying on standard labelling procedures has 
meant that, even in these cases, labelled antibody 
techniques are increasingly preferred. Though the 
commercial kits based on these various labels 
differ to a minor extent in sensitivity, specificity,
convenience, etc., such differences are at least 
partially attributable to differences in the physi- 
cochemical characteristics of the antibodies used 
in the kits, and to other 'immunological' factors 
unconnected with the particular nature of the 
label per se. 

Despite the obvious attractions of chemi-
luminescent techniques in an immunoassay
context, the use of fluorescent labels combined with
sophisticated time-resolution techniques for their
detection (a concept arising from discussions with
[name illegible]) appeared to us (in the
mid-1970s) to offer more exciting long-term
possibilities for a number of reasons. These
naturally included attainment of the enhanced
specific activities and high signal to background
ratios required for ultra-sensitive immunoassay as
indicated above. However, more importantly,
fluorescence techniques also appeared to provide
a simple route to the development of 'multi-
analyte' assay systems of the kind described
below.

In pursuance of this strategy, we began
collaboration with LKB/Wallac, ca. 1976-77, in
the development of the instrumentation and
technology required to develop such methods.
Fortunately a group of fluorescent substances
generally known as the lanthanide chelates
(including, in particular, the chelates of
europium, samarium and terbium) facilitate such
development, possessing prolonged fluorescence
decay times, a large Stokes shift
(~300nm) and other desirable physical
characteristics which permit the construction of relatively
cheap instrumentation for their measurement
(Marshall et al., 1981; Hemmila et al., 1983). The
fluorescent properties of the lanthanide chelates
may be compared with those of a conventional
fluorophor such as fluorescein, which is
characterized by a much smaller Stokes shift (~28nm)
and a fluorescent decay time and emission
spectrum which imply that it is less readily
distinguished from fluorescent substances present
in the sample (such as bilirubin) or in plastic sample
holders. The fluorescence characteristics
of the lanthanide chelates thus permit them to be
measured in the presence of a fluorescence
background (deriving from extraneous sources)
which, in practice, approaches zero. Fig. 6
illustrates the basic concepts involved in pulsed-
light, time-resolved, fluorescence measurement,
which form the basis of the DELFIA
immunoassay system currently marketed by LKB/Wallac.
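
The principle of time-resolved measurement can be sketched with a single-exponential decay model. All numbers below are hypothetical and chosen only for illustration (a 700 microsecond label lifetime, a 10 nanosecond background lifetime, and a counting gate opened 400 microseconds after the excitation pulse); none is taken from the text.

```python
import math

def fraction_in_gate(lifetime_us, delay_us, width_us):
    """Fraction of a single-exponential emission that falls inside a counting
    gate opened delay_us after the excitation pulse and held for width_us."""
    start = math.exp(-delay_us / lifetime_us)
    end = math.exp(-(delay_us + width_us) / lifetime_us)
    return start - end

# Hypothetical long-lived chelate versus short-lived background fluorescence.
chelate = fraction_in_gate(lifetime_us=700.0, delay_us=400.0, width_us=400.0)
background = fraction_in_gate(lifetime_us=0.01, delay_us=400.0, width_us=400.0)
print(f"chelate emission counted:    {chelate:.3f}")
print(f"background emission counted: {background:.3e}")
```

Under these assumptions roughly a quarter of the label's emission is still collected, whereas the short-lived background has decayed to a numerically negligible level before the gate opens.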

Though it is inappropriate to pursue this 
subject in greater detail, attention should also be 
drawn to the possibilities offered by phase- 
resolved fluorimetry. This permits separate iden- 
tification of fluorophores differing in fluorescence 
lifetime by their exposure to a sinusoidally 
modulated exciting light source, and observation 
of their demodulated, phase-shifted, light emis- 
sion (McGown and Bright, 1984). This technique 
offers the possibility both of the development of 
homogeneous assays (relying on a difference in 
fluorescence decay time of bound and free forms 
of the fluorescent-labelled molecule), and of 
discriminating between two labelled antibodies in 
the context of multi-analyte 'ratiometric' im- 
munoassay as discussed below. 

'AMBIENT ANALYTE' IMMUNOASSAY

Before proceeding to a discussion of the develop- 
ment of multi-analyte assays, another important 
concept, termed 'ambient analyte immunoassay'
(Ekins, 1983b), must first be examined. This
term is intended to describe a type of
immunoassay system which, unlike conventional




methods, measures the analyte concentration in 
the medium to which an antibody is exposed, 
being essentially independent both of sample 
volume, and of the amount of antibody present. 
This concept is illustrated in Fig. 7, and relies on 
the physicochemically-based proposition that, 
when a 'vanishingly small' amount of antibody 
(preferably, but not essentially, coupled to a solid 
support) is exposed to an analyte-containing
medium, the resulting (fractional) occupancy of 
antibody binding sites solely reflects the ambient 
analyte concentration. Clearly the binding by 
antibody of analyte results in a depletion of the 
amount of analyte in the surrounding medium, 
but provided the proportion so bound is small 
(i.e. less than, for example, 1% of the total), such 
disturbance can be ignored. (This effect is closely 
analogous to that caused by the introduction of a 
thermometer into a medium possessing a much 
larger thermal capacity; the temperature disturb- 
ance caused by the thermometer itself is negligi- 
ble and can, in these circumstances, be disre- 
garded.) 

The principles of ambient analyte assay derive 
from the recognition that all immunoassays 
essentially depend upon measurement of the 
'fractional occupancy' by analyte of antibody
binding sites following reaction of analyte with
antibody as discussed above (Figs 3(a) and (b)).
The fractional occupancy of ('monospecific' or
'monoclonal') antibody binding sites in the
presence of varying analyte concentrations,
plotted against antibody concentration, is portrayed
in Fig. 8. The fraction of analyte bound is also
plotted in this figure. (Note: for the sake of
generality, all concentrations in this figure are
expressed in terms of 1/K, where K is the affinity
constant of the antibody. For example, if K =
10^11 L/M, a concentration of 0.1 x 1/K represents
0.1 x 10^-11 M/L, or 0.1 x 10^-11 x 10^-3 x 6.02 x
10^23 = 6.02 x 10^8 molecules/ml.)

Figure 6. Basic principles of pulsed-light, time-resolved
fluorescence. Fluorescence emitted by the fluorophor
(typically a europium chelate) is distinguished from background
fluorescence, which decays more rapidly.

Figure 7. Basic principle of 'ambient analyte' immunoassay
(AAI). The fractional occupancy (F) of a vanishingly small
amount of antibody (of affinity K) is determined by the analyte
concentration in the medium ([An]).
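
The conversion from units of 1/K to molecules/ml used in the note on Fig. 8 can be written as a small helper. The function name is hypothetical; the value of Avogadro's number is the rounded one used in the text.

```python
AVOGADRO = 6.02e23  # molecules per mole, as rounded in the text

def molecules_per_ml(conc_in_units_of_1_over_k, affinity_l_per_mol):
    """Convert a concentration expressed in units of 1/K to molecules/ml."""
    molar = conc_in_units_of_1_over_k / affinity_l_per_mol  # mol/L
    return molar * 1e-3 * AVOGADRO                          # per litre -> per ml

example = molecules_per_ml(0.1, 1e11)
print(f"0.1 x 1/K with K = 1e11 L/M is {example:.3g} molecules/ml")
```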

It should be particularly noted that, at antibody 
concentrations of less than ca. 0.01 x 1/K, antibody
fractional occupancy is essentially dependent
solely on the analyte concentration in the
medium, and is independent of variations in
antibody concentration. This reflects the fact that
this concentration of antibody binds less than
approximately 1% of the analyte in the medium,
irrespective of its concentration. This implies, for
example, that the introduction of 10, 100, or 1000
antibody molecules into a medium containing
billions of analyte molecules will result, in each
case, in virtually identical fractional antibody
binding-site occupancy, the upper limit of
antibody concentration being determined by the
antibody affinity constant. (An antibody
concentration of 0.01 x 1/K is a hundred-fold less than
that (1 x 1/K) necessary to bind 50% of a 'trace'
amount of analyte (see Fig. 8), claimed by Berson
and Yalow (1973) as maximizing assay
'sensitivity' (i.e. the slope of the dose-response curve
when expressed in terms of bound/free labelled
analyte). This false conclusion has subsequently
become incorporated into the mythology of
radioimmunoassay design which, regrettably, a
majority of kit manufacturers continue to accept.)

The ambient analyte assay concept was first
exploited in the development of
what has come to be known as 'two-step' free
hormone immunoassay (Ekins et al., 1980), but it
is clear that it is of far wider application, and can,
in particular, be utilized in the construction of
immunosensors and immunoprobes. One such 
example is a probe for the measurement of 
salivary steroids that is currently being developed 
in our laboratory. Comprising a small antibody- 
coated plastic 'dipstick' comparable in size and 
shape to a clinical thermometer, this device is 
intended to permit the measurement of salivary 
steroid levels without requiring the collection of 
saliva. However, the concept also underlies our
approach to multi-analyte immunoassay, likewise
under development in our laboratory.

MULTI-ANALYTE 'RATIOMETRIC'
IMMUNOASSAY SYSTEMS

The concepts relating to 'ambient analyte'
immunoassay and assay sensitivity outlined above
are both exploited in our present development of 
a random access, multi-analyte, immunoassay 
technology capable of measuring, in the same 
small sample, virtually any number of individual 
analytes from selected analyte 'menus' (e.g. a 
hormone menu, viral antigen menu, an allergen 
menu, etc.). Many examples of a need to measure 
a multiplicity of different analytes in the same 
sample exist in medical diagnosis, for example, in 
the routine diagnosis of thyroid disease, where it 
is frequently necessary to measure a number of 
different hormones and thyroid-related proteins. 
At present, clinicians frequently experience diffi- 
culty in deciding on the best sequence of tests to 
arrive at a correct diagnosis. Such problems 
would be overcome were all relevant analytes 
measurable at a cost comparable to the cost of 
measurement of a single substance. Our own 
immediate objective is the development of a 
technology permitting the measurement of com- 
plete 'hormone profiles' using a single small blood 
sample. However, the need for 'multi-analyte', or
'random access' measurement is not confined to 
medical diagnosis: it also arises, for example, in 
the pharmaceutical industry (where there exists a 
requirement to ensure the purity of protein drugs 
synthesized by recombinant DNA techniques), in 
the food industry and elsewhere. Though still at 
an early stage, our approach to the achievement 
of this objective can be briefly indicated. 



Multi-analyte assay: general principles 

As discussed above, the notion of ambient 
analyte assay simultaneously introduces two 
extremely important and novel concepts: (a) that 
an estimate of analyte concentration can be based 
upon the use of an infinitesimal amount of 
'sampling' antibody, and (b) that such an estimate 
derives from a direct measurement of fractional 
antibody occupancy by analyte, irrespective of 
the exact amount of antibody used. It should be 
emphasized that the latter proposition is valid 
only in the context of ambient analyte assay, and 
is not true in current conventional immunoassay 
systems (in which fractional antibody occupancy 
depends both upon the amount of antibody in the 



system, and sample volume— see Fig. 8). In short, 
exposure of a small number of antibody mole- 
cules (in the form, for example, of a 'microspot' 
located on a solid support) to an analyte- 
containing fluid results in occupancy of antibody 
binding sites in the microspot reflecting the 
analyte concentration in the medium. Following 
such exposure, the antibody-bearing probe may 
be removed and exposed to a 'developing' 
solution containing a high concentration of an 
appropriate second antibody directed against 
either a second epitope on the analyte molecule if 
this is large (i.e. the occupied site), or against 
unoccupied antibody binding sites in the case of 
small analyte molecules (see Fig. 3(b)). (Note: an 
antibody simulating antigen, and reacting with 
unoccupied binding sites, is described as a 
'mirror-image anti-idiotypic antibody'; the use of
such an antibody instead of labelled antigen is 
convenient but not essential, and is suggested 
here merely to simplify illustration of the basic 
concepts involved.) 

Subsequently, an estimate of binding-site occu- 
pancy of the 'sampling' (solid phase) antibody 
located in the microspot may be derived by 
measurement of the ratio of signals emitted by the 
two antibodies forming the dual-antibody 'coup- 
lets'. This can be conveniently achieved by 
labelling the 'sampling' and 'developing' anti- 
bodies with different labels, for example, a pair of 
radioactive, enzyme or chemiluminescent mar- 
kers. Fluorescent labels are nevertheless particu- 
larly useful in this context because, by the use of 
optical scanning techniques, they permit arrays of 
different antibody 'microspots' distributed over a 
surface, each directed against a different analyte, 
to be individually examined, thus enabling 
multiple assays to be simultaneously carried out 
on the same small sample. Fig. 9 illustrates these 
basic ideas, and Fig. 10 such an array. 



Microspot immunoassay sensitivity: 
theoretical considerations 

The notion that it is, in principle, possible to 
measure an analyte concentration using a micros- 
pot of antibody comprising a number of antibody 
molecules in the range ca 10 ! -10 6 is likely, at first 
sight, to appear surprising, and may, indeed, 
provoke scepticism regarding the assay sensitivi- 
ties potentially attainable using this approach. 
Clearly a number of factors, such as the sensitivity
of the signal measuring equipment, the density of
antibody molecules on the surface of the solid
support, etc., are likely to play a part in
determining final assay sensitivity. Such factors
are, in turn, dependent on the efficiency with
which the particular labels used can be detected,
the adsorption properties of antibody supports,
etc. Though these are obviously variable,
reasonable estimates can be made of the order of
sensitivities likely to be achieved on the basis of
some simple theoretical calculations. To clarify
the following discussion, it is assumed that
'sensing' antibody can be uniformly and
consistently coated on a solid matrix at a standard
density, implying that only the 'developing'
antibody need be labelled and measured in order
to ascertain fractional occupancy of sensing
antibody binding sites.

Figure 10. 'Multi-analyte' antibody array. Each antibody
'microspot' represents a 'vanishingly small' amount of
antibody directed against an individual analyte.

Fig. 11 illustrates the surface of an antibody
microspot, of surface area A (µm^2), and (uniformly)
coated with antibody of affinity K (L/M) in a
monomolecular layer of density D (molecules/µm^2).
Let us assume that the spot is exposed to an
analyte-containing medium of volume v (ml), and
containing an analyte concentration C molecules/ml.
The molecular concentration of antibody in
the system is thus given by AD/v. (Note: the fact
that antibody is situated on the surface of a solid 
support, and not evenly distributed throughout 
the medium, does not affect the extent of analyte 
binding at thermodynamic equilibrium, assuming 
that antibody binding sites are not impeded in 
their reactions and have not been damaged during 
the coating process.) 

CHEMILUMINESCENT AND FLUORESCENT MARKERS

Figure 11. Microspot ambient analyte immunoassay. The
microspot shown is assumed to be uniformly coated with
antibody, though if the dual-labelled antibody 'ratiometric'
approach shown in Fig. 9 is adopted, uniform coating is not
essential. The minimum fluid volume for ambient analyte assay
conditions to prevail (enabling adoption of the ratiometric
approach) is shown.
    Surface area: A sq. µm
    Antibody density: D molecules/sq. µm
    Antibody affinity: K L/M
    Avogadro's number: N molecules/M
    Minimum test sample volume (mls): A x D x K x 10^5/N

Meanwhile, fractional occupancy (F) of antibody
binding sites by analyte (at equilibrium) is
given by the equation:

$$F^2 - F\left(\frac{1}{q} + \frac{p}{q} + 1\right) + \frac{p}{q} = 0 \qquad (1)$$

where p = analyte concentration, q = antibody
concentration (both expressed in units of 1/K).

Thus, for antibody binding site concentrations
tending to zero (i.e. q < 0.01), F = p/(1 + p) (see Fig. 8).

Likewise, the fraction of analyte bound by
antibody (f) at equilibrium is given by the equation:

$$f^2 - f\left(\frac{1}{p} + \frac{q}{p} + 1\right) + \frac{q}{p} = 0 \qquad (2)$$

Thus, for analyte concentrations tending to zero (i.e. p <
0.01), f = q/(1 + q) (see Fig. 8). Furthermore,
when q < 0.01, and when p ≥ 0, f < 0.01.
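
Equations (1) and (2) both follow from the law of mass action written in units of 1/K: the bound concentration b satisfies b = (q - b)(p - b), with F = b/q and f = b/p. The sketch below solves this quadratic numerically to check the limiting behaviour stated above; the function names and the test values of p and q are illustrative assumptions.

```python
import math

def bound_concentration(p, q):
    """Equilibrium antibody-analyte complex concentration, in units of 1/K."""
    s = p + q + 1.0
    # Smaller root of b^2 - s*b + p*q = 0, written to avoid cancellation.
    return 2.0 * p * q / (s + math.sqrt(s * s - 4.0 * p * q))

def fractional_occupancy(p, q):
    """F: fraction of antibody binding sites occupied by analyte."""
    return bound_concentration(p, q) / q

def fraction_of_analyte_bound(p, q):
    """f: fraction of the analyte bound by antibody (requires p > 0)."""
    return bound_concentration(p, q) / p

p = 1.0  # analyte concentration equal to 1/K
for q in (0.0001, 0.001, 0.01):
    F = fractional_occupancy(p, q)
    f = fraction_of_analyte_bound(p, q)
    print(f"q = {q:<7} F = {F:.5f}  f = {f:.5f}")
```

For every q at or below 0.01 the occupancy stays close to p/(1 + p) = 0.5 while less than 1% of the analyte is bound, which is the 'ambient analyte' condition.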

Expressed in units of 1/K, the concentration (q)
in the assay of 'sensing' antibody situated on the
microspot is given by $DAK/(v \times 6 \times 10^{20})$ (since
Avogadro's constant, expressed as the number of
molecules/mmol, is $6 \times 10^{20}$ (approximately)).
The fraction of an analyte concentration → 0
which will be bound to the spot is therefore
$DAK/(v \times 6 \times 10^{20} + DAK)$, implying that the
number of analyte molecules bound to the spot is
given by $vCDAK/(v \times 6 \times 10^{20} + DAK)$.
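The expressions above can be evaluated directly. The sketch below is illustrative only: the function names are ours, and the spot area (50 μm²) and sample volume (1 ml) are assumed values chosen to show the orders of magnitude involved, with D and K taken from the worked examples in the text. The "minimum volume" helper simply rearranges the condition q < 0.01.

```python
N_PER_MMOL = 6e20  # Avogadro's constant in molecules/mmol (approximate, as in the text)


def sensing_antibody_q(D, A, K, v):
    """Sensing-antibody concentration q in units of 1/K.

    D: antibody density (molecules/um^2), A: spot area (um^2),
    K: affinity (L/M), v: sample volume (ml).
    """
    return D * A * K / (v * N_PER_MMOL)


def fraction_bound_low_analyte(D, A, K, v):
    """Fraction of a vanishingly small analyte concentration bound to the spot."""
    dak = D * A * K
    return dak / (v * N_PER_MMOL + dak)


def minimum_ambient_volume_ml(D, A, K, q_max=0.01):
    """Smallest sample volume (ml) for which q stays at or below q_max."""
    return D * A * K / (q_max * N_PER_MMOL)


if __name__ == "__main__":
    D, A, K, v = 1e5, 50.0, 1e11, 1.0  # A and v are assumed illustrative values
    print("q =", sensing_antibody_q(D, A, K, v))
    print("fraction bound =", fraction_bound_low_analyte(D, A, K, v))
    print("minimum volume (ml) =", minimum_ambient_volume_ml(D, A, K))
```

With these assumed values q is well below 0.01, so less than 1% of the analyte is sequestered by the spot and ambient analyte conditions hold.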

Case 1: sandwich (two-site) assay. Following
incubation of sample with antibody, we assume
the sample is removed, and the microspot then
exposed to a volume V (ml) of a solution of a
second, labelled, 'developing' antibody of affinity
K* (L/M) at a concentration given by Q
(expressed in units of 1/K*).



The fraction of analyte bound by labelled
antibody (F*) at equilibrium is given by the
equation:

$F^{*2} - F^*(1/P + Q/P + 1) + Q/P = 0$ (3)

where P represents the analyte concentration in
the developing-antibody solution, expressed in
units of 1/K*, i.e.
$vCDAKK^*/[(v \times 6 \times 10^{20} + DAK)\,V \times 6 \times 10^{20}]$.

Assuming P < 0.01, $F^* = Q/(1 + Q)$. (For
example, if Q = 1, the fraction of analyte
molecules bound by labelled antibody = 0.5
approximately). Thus, since the number of
analyte molecules bound to the spot is given by
$vCDAK/(v \times 6 \times 10^{20} + DAK)$, the number of
analyte molecules labelled by the second,
developing, antibody is given by
$vCDAKQ/[(v \times 6 \times 10^{20} + DAK)(1 + Q)]$, and the surface density
of such molecules is given by
$vCDKQ/[(v \times 6 \times 10^{20} + DAK)(1 + Q)]$. Moreover, assuming that
$DAK \ll v \times 6 \times 10^{20}$ (i.e. that the amount of
antibody in the system is such that 'ambient assay'
conditions prevail), then the surface density (D*)
of developing-antibody molecules =
$CDKQ/[(6 \times 10^{20})(1 + Q)]$ approximately. It should be
noted that D* is independent of both v and V,
also that the ratio $D^*/D = C \times KQ/[(6 \times 10^{20})(1 + Q)]$
= C × constant.

If the minimum detectable surface density of
developing-antibody molecules (i.e. $\sigma_{D^*}$, the
standard deviation of the measurement of D*
when C = 0) is given by $D^*_{min}$ (molecules/μm²)
and $C_{min}$ represents the minimum detectable
analyte concentration in the test sample, then,
disregarding non-specific binding of developing
antibody within the microspot area,

$C_{min} = D^*_{min} \times [(6 \times 10^{20})(1 + Q)]/DKQ$ (4)

For example, if Q = 1, D = $10^5$ molecules/μm², K
= $10^{11}$ L/M and $D^*_{min}$ = 20 molecules/μm², then
$C_{min}$ = $2.4 \times 10^6$ molecules/ml ≈ $4 \times 10^{-15}$ M/L
(i.e. of the order of $10^{-15}$ M/L). It
should be noted that, in this example, the fractional
occupancy of the sensing antibody binding sites
by the minimum detectable analyte concentration
is 0.04%.
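The arithmetic of Eq. (4) and of the worked example can be reproduced with a few lines of Python. This is a sketch under the stated ambient-assay assumptions; the function names are ours, and the molar conversion uses the approximate value of 6 × 10^20 molecules/mmol adopted in the text.

```python
N_PER_MMOL = 6e20  # molecules/mmol (approximate); molecules/ml divided by this gives M/L


def sandwich_c_min(d_star_min, D, K, Q):
    """Eq. (4): minimum detectable analyte concentration (molecules/ml)."""
    return d_star_min * N_PER_MMOL * (1.0 + Q) / (D * K * Q)


def developing_to_sensing_ratio(C, K, Q):
    """Ratio D*/D for analyte concentration C (molecules/ml); independent of v and V."""
    return C * K * Q / (N_PER_MMOL * (1.0 + Q))


def sensing_occupancy(C, K):
    """Fractional occupancy p / (1 + p) of the sensing antibody, with p = K * C (molar)."""
    p = K * C / N_PER_MMOL
    return p / (1.0 + p)


if __name__ == "__main__":
    c_min = sandwich_c_min(d_star_min=20.0, D=1e5, K=1e11, Q=1.0)
    print("C_min (molecules/ml):", c_min)
    print("C_min (M/L):", c_min / N_PER_MMOL)
    print("sensing-site occupancy:", sensing_occupancy(c_min, 1e11))
    print("D* recovered from ratio:", developing_to_sensing_ratio(c_min, 1e11, 1.0) * 1e5)
```

Because the ratio D*/D depends only on C, K and Q, the same calculation underlies the ratiometric readout: the signal ratio is proportional to analyte concentration regardless of spot area or sample volume.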

Case 2: anti-idiotypic antibody ('competitive') 
assay. In this case, we assume that, following 
removal of the sample, the microspot is exposed 
to a volume V(ml) of a solution of (for example) a 
second, labelled, anti-idiotypic antibody reacting 
with unoccupied sites on the sensing antibody. 
Using similar reasoning as above, we may 
likewise assume that the fraction of such sites 
which become occupied by the anti-idiotypic 
'developing' antibody is given by $Q/(1 + Q)$,
where Q is the developing-antibody concentra- 
tion. However, the minimum detectable surface 
density of anti-idiotypic antibody is not, in a 
competitive design, the critical determinant of 
assay sensitivity; this parameter is essentially 
governed by the precision of the density measure- 
ment. 

From Eq. (1), the fraction of sites unoccupied
by analyte = $1/(1 + p)$, and the fraction occupied
by anti-idiotypic antibody = $Q/[(1 + p)(1 + Q)]$.
Thus, if the CV in the measurement of anti-idiotypic
antibody is ε, the standard deviation is
$\varepsilon Q/[(1 + p)(1 + Q)]$. This term also represents the
SD in the estimate of the fraction of sites occupied
by analyte. Since the total number of antibody
binding sites in the spot is DA, the SD in the
estimate of occupied sites as p → 0 (i.e. $\sigma_{D^*}$)
approximates $\varepsilon DAQ/(1 + Q)$; the SD in the
occupied site surface-density estimate is thus
$\varepsilon DQ/(1 + Q)$. But the SD in the measurement of
fractional binding-site occupancy when p → 0
defines $D_{min}$, and hence the minimum detectable
analyte concentration in the test sample as
indicated in Eq. (4).

Thus

$C_{min} = D_{min} \times [(6 \times 10^{20})(1 + Q)]/DKQ$ (5)

$\quad = [\varepsilon DQ/(1 + Q)] \times [(6 \times 10^{20})(1 + Q)]/DKQ$ (6)

$\quad = (\varepsilon/K) \times (6 \times 10^{20})$ (7)



For example, if values of Q = 1, D = $10^5$
molecules/μm², and K = $10^{11}$ L/M are assumed as
in the non-competitive example considered
above, and the CV in the measurement of
anti-idiotypic antibody density in the microspot is
1% (i.e. ε = 0.01), then $D_{min}$ = 500 molecules/
μm², and $C_{min}$ = $6 \times 10^7$ molecules/ml =
$10^{-13}$ M/L. Fractional occupancy of the sensing
antibody binding sites by the minimum detectable
analyte concentration is, in this example, 1%. It
should be noted that the sensitivity limit of ε/K
(expressed in molar terms) is identical to that
previously established for conventional 'competitive'
assays (Ekins and Newman, 1970), and
which underlies the predictions represented in
Fig. 4.
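Eqs. (5) to (7) can be checked numerically in the same way. The following Python sketch (function names are ours, for illustration) evaluates the competitive-design detection limit both via Eq. (5) and via the simplified Eq. (7), using the parameter values of the example above.

```python
N_PER_MMOL = 6e20  # molecules/mmol (approximate, as in the text)


def competitive_d_min(eps, D, Q):
    """SD of the occupied-site surface density as p -> 0 (molecules/um^2)."""
    return eps * D * Q / (1.0 + Q)


def c_min_from_d_min(d_min, D, K, Q):
    """Eq. (5): detection limit (molecules/ml) from the minimum detectable density."""
    return d_min * N_PER_MMOL * (1.0 + Q) / (D * K * Q)


def competitive_c_min(eps, K):
    """Eq. (7): detection limit (molecules/ml); D and Q cancel out."""
    return eps / K * N_PER_MMOL


if __name__ == "__main__":
    eps, D, K, Q = 0.01, 1e5, 1e11, 1.0
    d_min = competitive_d_min(eps, D, Q)
    print("D_min (molecules/um^2):", d_min)
    print("C_min via Eq. (5):", c_min_from_d_min(d_min, D, K, Q))
    print("C_min via Eq. (7):", competitive_c_min(eps, K))
    print("C_min (M/L):", competitive_c_min(eps, K) / N_PER_MMOL)
```

With these values the competitive limit is 25 times higher than the sandwich-assay limit of 2.4 × 10^6 molecules/ml obtained from Eq. (4) with the same Q, D and K.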

Such considerations appear to suggest (a) that 
microspot assay sensitivities superior to those 
obtainable by conventional radioisotopically 
based immunoassays are achievable, and (b) that 
sensitivities yielded by non-competitive microspot 
assays are likely to be considerably greater than 
those of corresponding competitive microspot
assays. It must be emphasized, however, that, 
though such predictions are likely to prove 
correct, assumptions regarding the performance 
of the labels and signal-measuring instrument 
used are incorporated in the simple theoretical 
analysis discussed above. Such factors are clearly 
of importance in determining overall microspot 
immunoassay performance. 



Practical implementation 

The concepts discussed above are clearly exploit- 
able using a variety of antibody labels, including 
chemiluminescent labels; however, our prelimin- 
ary studies have been based on the use of 
conventional fluorophores, since the technology 
of simultaneous measurement of dual fluoresc- 
ence from small areas is already well established. 
Because this volume centres on chemiluminesc- 
ence, we shall provide only a brief indication of 
our initial experimental work in this area, which is 
currently based on the use of commercially 
available confocal microscopes. 

Instrumentation: the laser scanning confocal 
microscope. In laser scanning confocal fluorescence
microscopy, a small area of the specimen is
illuminated by a focused laser beam; the fluoresc- 
ence photons emanating solely from this area are, 
in turn, focused onto a photon detector. Both the 
intensity of illumination and the efficiency of light 
collection diminish rapidly with distance from the 
focal plane (Fig. 12). At the 'confocal' point, the 
projection of the illumination pinhole and the 
back-projection of the detector pinhole coincide. 
Such systems contrast with conventional epi- 
fluorescence methods, where the specimen is 
exposed to an essentially uniform flux of illumination
(White et al., 1987).

Figure 12. Principle of the confocal microscope. Illuminating
light is focused at a point in the focal plane. Reflected light
from this point is focused onto a detector. A complete
two-dimensional image of structures within the focal plane is
obtained by scanning the selected area of interest, and may
be stored in a microcomputer for video display.

Sensitivity of current instruments. Typically,
fluorescence photons emanating from the
laser-illuminated area are detected by a low
dark-current photomultiplier. Electrons spontaneously
emitted by the photomultiplier photocathode 
contribute to the background signal of the 
instrument, and must, for highest sensitivity, be 
minimized. Fortunately the overall design of such 
instruments permits the photomultiplier photo- 
cathode to be of very small area, so that this 
particular source of background noise is not only 
small, but can be expected to reduce in relative 
importance with future improvement in photo- 
multiplier design. Meanwhile current instruments 
already display very high sensitivity of detection 
of fluorescent signals. For example, the confocal 
microscope manufactured by Zeiss is claimed to 
display a lower detection limit for fluorescein of 
about ten molecules/μm² (Ploem, 1986). Most
commercially available FITC-labelled IgG attains
a fluorophore/protein molar ratio of ~4; thus the
detection limit ($D^*_{min}$) of the Zeiss microscope is
~2-3 FITC-labelled IgG molecules/μm². This
implies an analyte-concentration detection limit
of ~$2.4 \times 10^5$ molecules/ml for a two-site assay,
assuming the same parameter values as used in
the examples discussed above, or $2.4 \times 10^4$
molecules/ml using a 'sensing' antibody of affinity
$10^{12}$ L/M.

Another comparable instrument is the Bio- 
Rad/Lasersharp laser scanning confocal micro- 
scope, which we are currently using in the 
development of 'ratiometric' multi-analyte assay
methodology in accordance with the principles 
outlined above (see Fig. 13). The argon laser in 
this system possesses two excitation lines at 488 
and 514 nm. It is thus particularly efficient for the 
excitation of blue/green emitting fluorophores 
such as FITC (which displays an excitation 
maximum at 492 nm). However, it is considerably 
less efficient in the excitation of red-emitting 
fluorophores such as Texas red (excitation maximum
596 nm). Nevertheless, the ratiometric
immunoassay principle permits considerable
variation in detection efficiencies of the two labels
relied on since, inter alia, the specific activities of 
the two labelled antibody species forming the 
antibody couplets can be chosen to yield optimal 
signal ratios in the region of unity. Thus 
inefficiency of the argon laser in exciting red 
emitting fluorophores is not necessarily a major 
handicap in the present context. 

Figure 13. Dual-channel confocal fluorescence microscope
permitting simultaneous measurement of the fluorescence
signals from two fluorophores situated at the focal point. By
scanning the antibody array, the ratio of signals from each
antibody microspot may be determined.

Though the current Lasersharp instrument
relies on a conventional microscope rather than a
purpose-designed optical system (and appears to
be less sensitive), it permits quantification of
fluorescence signals generated from microspots of
selected area. Initial studies have revealed that,
under conditions that are not necessarily optimal,
the instrument is capable of detecting approximately
twenty-five FITC-labelled IgG molecules/μm²,
scanning an area of ~50 μm² (Fig. 14). It
must be stressed that neither of these confocal
microscopes is designed specifically for routine
ratiometric multi-analyte immunoassay use, and
it can be anticipated that future instruments
constructed specifically for this purpose are likely
to prove both cheaper and more sensitive.

Other instruments. The MPM 200 Microscope
Photometer manufactured by Zeiss of West
Germany is anticipated to become available
shortly. This photometer is claimed to be highly
versatile: it can be used in transmission and
reflection modes, and as a highly sensitive
fluorimeter. The measuring field can be varied in
shape and size for optimum adjustment to the
specimen structure. More generally, the technology
of sensitive light measurement is improving
rapidly in response to needs in astronomy, the
space program, etc., such technology clearly being
readily exploitable in a multi-analyte immunoassay
context using light-generating labels in
accordance with the broad principles presented

Solid antibody supports. On the basis of the
theoretical considerations discussed above, it is
evident that solid antibody supports for
multi-analyte immunoassay use should display a
capacity to adsorb a high surface density of antibody
combined with low intrinsic signal-generating
properties (for example, low intrinsic
fluorescence), thus minimizing background. We have
examined a number of materials, including
polypropylene, Teflon, cellulose and nitrocellulose
membranes and microtitre plates (clear
polystyrene plates from Nunc; black, white and
clear polystyrene plates from Dynatech), with
these criteria in mind. White Dynatech Microfluor
microtitre plates, formulated specially for
the detection of low fluorescence signals, yield
high signal-to-noise ratios and have therefore
been provisionally used in our developmental
studies.

Surface density of antibody coating. Preliminary 
experiments using Microfluor plates have re- 
vealed that it is possible to coat them with 
antibody at a surface density of at least $5 \times 10^4$
IgG molecules/μm² (Fig. 15). Moreover, nearly all
antibody molecules so deposited appear to retain 
immunological activity (Fig. 16). 



Figure 16. Surface density of immunoreactive IgG molecules (number of molecules/μm²) plotted as a function of the total
surface density of IgG (number of molecules/μm²) on Dynatech Microfluor white microtitre plates.

Verification of the 'ratiometric' immunoassay concept.
Our primary intention, in initial studies, has
been establishment of the basic conditions which,
using a particular instrument, can be anticipated
on theoretical grounds to yield high assay
sensitivity. Though the setting up of individual
microspot immunoassays has thus appeared to us
to be of secondary importance during the initial
stages of our studies, we have nevertheless
thought it useful to confirm the validity of our
general concepts by comparing the performance
of certain assays when constructed in microspot
format and when conventionally designed. For
example, we have compared a dual-labelled
tumour necrosis factor (TNF) ratiometric assay
system using Texas red and FITC-labelled antibodies
with an optimized IRMA system using
identical antibodies but with the second antibody
125I-labelled. Although unoptimized, the
ratiometric microspot assay yielded formal sensitivity
values closely approaching that of the
conventional, optimized, IRMA. Although these
results verify the general concepts underlying
ratiometric microspot immunoassay methodology,
further work is required to achieve the
considerably greater sensitivity that theory predicts
as achievable using optimized reagent
concentrations and improved instrumentation.



CONCLUSION 

As indicated above, differentiation of the fluores- 
cent signals yielded by two fluorophores can be 
readily achieved solely on the basis of wavelength 
differences, and this approach has been relied on 
entirely in our preliminary studies. However, 



other physical techniques exploiting differences in 
decay time of two or more fluorescence emissions 
(using, for example, a pulsed or sinusoidally 
modulated laser source, and time- or phase- 
resolving detectors) are available, and can be 
expected both to further reduce background and 
to improve signal resolution, thus increasing assay 
sensitivity and precision. These considerations 
aside, the basic technology involved closely 
resembles that employed in domestic compact 
disk recorders and other similar data-storage 
devices, the obvious difference being that light 
emitted from each of the discrete zones forming 
the antibody-array is fluorescent rather than 
reflected, and yields chemical rather than physical 
information. Indeed, our preliminary studies 
suggest that highly sensitive immunoassays using 
antibody microspots of surface area approximating
50 μm² are achievable, implying that some
2,000,000 different immunoassays could, in principle,
be accommodated on a surface area of
1 cm². Though non-specific binding of a multiplicity
of developing antibodies would probably
prohibit the use of antibody arrays of this order, it 
is evident that the technology is capable of 
encompassing analyte numbers of the kind likely 
to be useful in practice. 

The development of multi-analyte assay systems
of this kind can be anticipated to bring about
fundamental changes in medical diagnosis and
many other biologically related areas. Systems
capable of measuring every hormone and other
endocrinologically related substance within a
single small sample of blood are within technological
reach, providing data which, when analysed
with the aid of computer-based 'expert'
pattern-recognition systems, are likely to reveal
endocrine deficiencies only dimly perceived using
current 'single-analyte' diagnostic procedures.
Such systems also provide a means to the
development of a 'random access' immunoassay
methodology, permitting the selection of any
desired test or combination of tests from an
extensive analyte menu. Clearly the accommodation
of a wide range of individual immunoassays
on a small immunoprobe (comparable in its
overall physical dimensions with a few drops of
blood) is likely to totally transform the logistics of
immunodiagnostic testing, and genuinely
represents, in our view, 'next generation' immunoassay
methodology.

Multianalyte Microspot Immunoassay: Microanalytical "Compact Disk" of the Future

R. P. Ekins and F. W. Chu



Throughout the 1970s, controversy centered both on
immunoassay "sensitivity" per se and on the relative
sensitivities of labeled antibody (Ab) and labeled analyte
methods. Our theoretical studies revealed that RIA sensitivities
could be surpassed only by the use of very high-specific-activity
nonisotopic labels in "noncompetitive" designs,
preferably with monoclonal antibodies. The time-resolved
fluorescence methodology known as DELFIA (developed in
collaboration with LKB/Wallac) represented the first
commercial "ultrasensitive" nonisotopic technique based on
these theoretical insights, the same concepts being
subsequently adopted in comparable methodologies relying
on the use of chemiluminescent and enzyme labels.
However, high-specific-activity labels also permit the
development of "multianalyte" immunoassay systems combining
ultrasensitivity with the simultaneous measurement of tens,
hundreds, or thousands of analytes in a small biological
sample. This possibility relies on simple, albeit
hitherto-unexploited, physicochemical concepts. The first is that all
immunoassays rely on the measurement of Ab occupancy
by analyte. The second is that, provided the Ab concentration
used is "vanishingly small," fractional Ab occupancy is
independent of both Ab concentration and sample volume.
This leads to the notion of "ratiometric" immunoassay,
involving measurement of the ratio of signals (e.g.,
fluorescent signals) emitted by two labeled Abs, the first (a
"sensor" Ab) deposited as a microspot on a solid support,
the second (a "developing" Ab) directed against either
occupied or unoccupied binding sites of the sensor Ab. Our
preliminary studies of this approach have relied on a
dual-channel scanning-laser confocal microscope, permitting
microspots of area 100 μm² or less to be analyzed,
and implying that an array of $10^6$ Ab-containing microspots,
each directed against a different analyte, could, in principle,
be accommodated on an area of 1 cm². Although
measurement of such analyte numbers is unlikely ever to
be required, the ability to analyze biological fluids for a wide
spectrum of analytes is likely to transform immunodiagnostics
in the next decade.

Additional Keyphrases: ratiometric immunoassays • scanning-laser
confocal microscope • fluoroimmunoassay

Immunoassay and other protein-binding assay meth- 
ods based on the use of radioisotopic labels have played 
a major role in medicine during the past three decades.
Their utility and importance have derived primarily
from the structural specificity of many reactions be- 
tween binding proteins and analytes and the detectabil- 
ity of isotopically labeled reagents, the latter endowing 
such techniques with "exquisite sensitivity." Recently, 
however, interest has increasingly focused on noniso- 
topic techniques based on identical analytical princi- 
ples, differing only in the nature of the marker used to 
label the reactant (e.g., antibody or antigen), whose 
distribution between reacted ("bound") and unreacted
("free") fractions constitutes the assay "response."

The basic aims underlying this interest can be 
broadly classed under four main headings: 

• avoidance of the environmental, legal, economic, and 
practical disadvantages of isotopic techniques (e.g., limited
shelf life of isotopically labeled reagents, problems
of radioactive waste disposal, cost and complexity of 
radioisotope counting equipment), particularly those 
impeding the development of, for example, simple diag- 
nostic kits for home or doctor's office use; 

• achievement of greater assay sensitivity; 

• "direct" measurement of analyte concentrations by 
use of transducer-based "immunosensors"; 

• simultaneous measurement of multiple analytes
("multianalyte assay").

In this presentation I will focus primarily on the last 
of these objectives, using this to set out the principles 
underlying our present attempts to develop a new "min- 
iaturised" technology that will permit the simultaneous 
measurement of an unlimited number of analytes in a 
small biological sample such as a single drop of blood. 
However, retention (and, if possible, improvement) of 
the high sensitivities of conventional isotopic tech- 
niques is a basic aim not only of our own studies in this 
area but also of most other endeavors falling under the
above headings. It is therefore appropriate to preface 
this paper with a discussion of the general principles 
underlying the attainment of high binding-assay sensi- 
tivity. 

Immunoassay Sensitivity: Some Basic Concepts 
Definition of Assay Sensitivity 

The need to establish assay conditions yielding max- 
imal sensitivity underlay the independent construction 
of mathematical theories of immunoassay design by 
both Yalow and Berson (1) and Ekins et al. (2) in the 
course of the original development of these methods in 
the early 1960s. Regrettably, these theoretical studies
led to a prolonged controversy, arising largely from the
conflicting concepts of "sensitivity" adopted by the two
groups (see Figure 1). Briefly, Berson and Yalow, in
their many publications relating to immunoassay design
(e.g., 1, 3), defined sensitivity as the slope of the
response curve relating the fraction or percentage of
labeled antigen bound (b) to analyte concentration ([H]).
In contrast, Ekins et al. (e.g., 2, 4) defined sensitivity as
the (im)precision of measurement of zero dose, this
quantity being indicative of, and essentially equivalent
to, the lower limit of detection.

Fig. 1. The differing concepts of sensitivity and precision underlying
radioimmunoassay design theories developed by (left) Yalow and
Berson (e.g., 1, 3) and (right) Ekins et al. (2, 4)
Yalow and Berson define assay A as more sensitive because it yields a
response curve of greater slope. Ekins et al. define assay B as more sensitive
because the imprecision of measurement of zero dose (σ0) is less. Yalow and
Berson likewise define an assay system as more precise if it yields a steeper
response curve when data are plotted on a log dose scale.

The key difference between these two definitions 
clearly lies in the dependence of the assay detection 
limit on the error (imprecision) in the measurement of 
the response variable. By neglecting this crucial factor, 
the "response curve slope" definition leads to many 
obvious absurdities. For example, plotting conventional 
RIA data in terms of the response metameter B/F (i.e., 
the bound to free ratio) suggests that assay "sensitivity"
is increased by increasing the antibody concentration in 
the system; however, the converse conclusion is reached 
if identical data are plotted in terms of F/B (see Figure
2). Observation of the shape and slopes of response 
curves without detailed error analysis thus constitutes a 
totally misleading guide to optimal immunoassay de- 
sign. This approach has, however, characterized many 
of the studies conducted in the immunoassay field dur- 
ing the past 30 years, and has been the source of much 




Any plot 




Response curve slope 



Defection limit 



Rg. 2. Schematic representation of RIA doae-reiponse curves 
observed for high and low antibody concentrations plotted in terms of 
[left) the free/bound fraction (F/B); (confer) the bound/free traction 

(B/F) 

Note that the tow antibody concentration yields a response curve of greater 
slope when the assay response is plotted h terms of F/B. but of tower slope 
when plotted in terms of B>F. The precision of measurement of zero dose 
(ACq) is Independent of the coordinate frame used to plot assay data (see 

right) 



mythology. For example, consideration of the Law of 
Mass Action reveals that, when response curves corre- 
sponding to different antibody concentrations are plot- 
ted in terms of b va [H], the maximal slope at zero dose 
is obtained for a concentration of 0.blK (where K is the 
affinity constant), in which circumstance the zero dose 
response (bo) is 33%. This conclusion led to Bereon and 
YaWs enunciation of the well-known dictum (which, 
albeit Erroneous, is broadly adhered to by many immu- 
noassay practitioners and kit manufacturers) that, to 
maximize RIA sensitivity, the amount of antibody to use 
in the system is that which binds 33% of labeled antigen 
in the absence of unlabeled antigen (1, 3). 
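The origin of the 33% figure can be confirmed numerically. The Python sketch below is illustrative and rests on assumptions not spelled out in the text: the labeled antigen is taken to be a tracer of negligible mass, concentrations are expressed in units of 1/K (p for analyte, q for antibody), and the bound fraction b is taken as the root of p·b² − (1 + p + q)·b + q = 0 that follows from the Law of Mass Action. The function names are ours.

```python
import math


def bound_fraction(p, q):
    """Fraction b of tracer bound; p (analyte) and q (antibody) in units of 1/K."""
    s = 1.0 + p + q
    # Root of p*b^2 - s*b + q = 0 lying in [0, 1]; well defined at p = 0.
    return 2.0 * q / (s + math.sqrt(s * s - 4.0 * p * q))


def initial_slope(q, h=1e-6):
    """Finite-difference estimate of db/dp at zero dose (negative: b falls with dose)."""
    return (bound_fraction(h, q) - bound_fraction(0.0, q)) / h


# Scan antibody concentrations from 0.001/K to 5/K for the steepest response curve.
candidates = [i / 1000.0 for i in range(1, 5001)]
steepest_q = min(candidates, key=initial_slope)
zero_dose_binding = bound_fraction(0.0, steepest_q)

if __name__ == "__main__":
    print("antibody concentration giving maximal slope (units of 1/K):", steepest_q)
    print("zero-dose bound fraction at that concentration:", zero_dose_binding)
```

The scan locates the steepest curve at an antibody concentration close to 0.5/K, where about one third of the tracer is bound at zero dose. Note that this reproduces only the slope-based criterion; it says nothing about measurement error, which is the point of the criticism made in the text.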

Disagreement regarding the concept of sensitivity 
inevitably led to prolonged dispute regarding immunoassay
design (5). However, although it is still common
to encounter publications in the field that rely solely on 
the response curve slope as a measure of sensitivity, the 
assay detection limit is now widely accepted as the only 
valid indicator of this parameter, and we do not there- 
fore intend to dwell further on this issue here. It is 
nevertheless relevant to an understanding of the "min- 
iaturized" assay methodology described below to empha- 
size that untenable concepts of both sensitivity and 
precision underlie many of the commonly accepted rules 
governing current immunoassay-design practice, some 
of which are contravened in our own approach. 

Basic Immunoassay Designs 

It is likewise important in the present context to
comprehend the basis of the various types of immunoassays
currently in use, and the constraints on the sensitivities
of which they are potentially capable. The radioimmunoassay
and analogous protein-binding assay
techniques originally developed for the measurement of 
insulin by Yalow and Berson (6), and of thyroxin and 
vitamin B 12 by Ekins and Barakat (7, 8), relied on the 
use of a labeled analyte marker to reveal the products of 
the binding reactions between analyte and binder (Fig- 
ure 3, left). This approach has subsequently often been 
portrayed as relying on "competition" between labeled 
and unlabeled analyte molecules for a limited number of 
protein-binding sites, such assays being frequently re- 
ferred to as "competitive." 

Subsequently, Wide et al. in Sweden (9), followed
shortly by Miles and Hales in the U.K. (10), developed
labeled antibody methods (Figure 3, right). These methods
represented an extension of the "labeled reagent"
methods (utilizing radiolabeled organic compounds such
as 131I-labeled p-iodosulfonyl chloride, [3H]acetic anhydride,
and other similar reagents) devised, during the
early 1950s, by Keston et al. (11), Avivi et al. (12), and
others for quantifying amino acids, steroid and thyroid
hormones, etc. Although radiolabeled antibody methods
(immunoradiometric assays; IRMAs) were originally
claimed (13) to be more sensitive than methods based on
the use of radiolabeled analyte, these claims were supported
by neither rigorous theoretical analysis nor persuasive
experimental evidence, and for some time remained
controversial. Further doubt on their validity
was cast by the publication by Rodbard and Weiss in
1973 (14) of detailed theoretical studies demonstrating
that both labeled analyte and labeled antibody methods
possessed essentially equal sensitivities. (Note: These
authors suggested that IRMAs might be more sensitive in
the assay of small polypeptides, in which radioiodine
incorporation into the antigen molecule was restricted;
conversely, these assays would be less sensitive for the
measurement of antigens of high molecular mass.)
Nevertheless, despite the appearance of this publication, the
belief that labeled antibody methods per se are
intrinsically more sensitive than the corresponding labeled
analyte methods gained wide acceptance among clinical

The reason for confusion on this issue is that the
greater potential sensitivity of certain assay formats is
not really a consequence of the labeling of antibody as
opposed to analyte; indeed, the apparent antithesis
between labeled-analyte and labeled-antibody methods
diverts attention from the true reasons underlying the
superior sensitivity of certain assay designs. Theoretical
analysis (see, e.g., 4, 15) reveals that, assuming "perfect"
separation of the products of the binding reaction
(i.e., no misclassification of bound and free moieties), the
optimal antibody concentration (for maximal sensitivity)
in a labeled analyte immunoassay invariably tends
to zero, irrespective of whether the free or bound labeled
analyte fraction is measured, whereas in labeled-antibody
methods the optimal antibody concentration depends
on which labeled-antibody fraction is measured
(see Figure 3). If the free (unreacted) antibody fraction is
measured, the optimal concentration also tends to zero;
conversely, if the analyte-bound fraction is measured,
the concentration tends to infinity. In short, of the four
basic measurement strategies available (labeled analyte,
with measurement of free or bound reaction product,
and labeled antibody, also with measurement of
free or bound product), only one permits, in practice, the
use of antibody concentrations approaching infinity.



This particular approach may, for want of a better term,
be described as "noncompetitive," although it must be
emphasized that such terminology involves a departure
from the original meanings attached to "competitive"
and "noncompetitive" when these descriptions were first
used in the present context. Indeed, as discussed below,
assays may be subclassed in this manner when no
labeled reagent of any kind is involved.

However, the categorization of immunoassays and
other binding assays as competitive or noncompetitive,
depending on the binding agent concentration yielding
maximal assay sensitivity, itself obscures the underlying
reasons for the existence of this divergence in assay
designs, and may thus be misleading. These reasons
may be more readily understood if the basic principles of
such assays are portrayed differently from their customary
presentation.

The "Antibody Occupancy Principle" of Immunoassay 

When a "sensor" antibody is introduced into an ana- 
lyte-containong medium, binding sites on the antibody 

TJ U T? fer^ molecuJeB t<> a fractional extent 
that reflects both the equilibrium constant governing 
the binding reaction, and the final concentration offree 

Z^T*^ ** milture - ™* Preposition stems 
immediately from the Law of Mass Action, which can be 
written as 

$$\frac{[\mathrm{AbAg}]}{[\mathrm{fAb}]} = K\,[\mathrm{fAg}] \qquad (1)$$

or as fractional occupancy of antibody binding sites, given by

$$\frac{[\mathrm{AbAg}]}{[\mathrm{Ab}]} = \frac{K\,[\mathrm{fAg}]}{1 + K\,[\mathrm{fAg}]} \qquad (2)$$

where [AbAg], [Ab], [fAb], and [fAg] represent the
concentrations (at equilibrium) of bound and total antibody,
and free antibody and antigen (analyte), respectively, and K =
equilibrium constant. The final concentration of free analyte
generally depends on the concentrations of both total analyte
and antibody; however, when total antibody approximates 0.05/K
or less, free and total antigen ([Ag]) concentrations do not
differ significantly, and fractional occupancy of antibody is
given by

$$\frac{[\mathrm{AbAg}]}{[\mathrm{Ab}]} = \frac{K\,[\mathrm{Ag}]}{1 + K\,[\mathrm{Ag}]} \qquad (3)$$

Assays utilizing this concept have been termed "ambient analyte
immunoassays" (16), fractional occupancy being independent of
both sample volume and antibody concentration (see below).
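The following is a minimal sketch, not taken from the article, of how the exact mass-action occupancy compares with the ambient-analyte approximation of equation 3. The function names and the numerical values (an affinity of 10^11 L/mol, analyte at 1/K) are assumptions chosen only for illustration.

```python
import math


def exact_occupancy(total_ab, total_ag, k):
    """Fraction of antibody sites occupied at equilibrium.

    total_ab, total_ag: total concentrations in mol/L; k: affinity in L/mol.
    """
    # The bound complex B satisfies K = B / ((Ab - B) * (Ag - B)), i.e.
    # B^2 - B*(Ab + Ag + 1/K) + Ab*Ag = 0. The smaller root is the
    # physical one; it is written in a cancellation-free form.
    s = total_ab + total_ag + 1.0 / k
    disc = math.sqrt(s * s - 4.0 * total_ab * total_ag)
    bound = 2.0 * total_ab * total_ag / (s + disc)
    return bound / total_ab


def ambient_occupancy(total_ag, k):
    """Equation 3: occupancy when free and total analyte barely differ."""
    return k * total_ag / (1.0 + k * total_ag)


K = 1e11        # assumed affinity constant, L/mol
AG = 1.0 / K    # assumed analyte concentration, mol/L

for ab_units in (1.0, 0.05, 0.01):   # antibody concentration in units of 1/K
    ab = ab_units / K
    print(ab_units, exact_occupancy(ab, AG, K), ambient_occupancy(AG, K))
```

Under these assumptions the two values converge as the antibody concentration falls toward 0.01/K, which is the behavior the text attributes to ambient analyte assays.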

All immunoassays essentially depend on measurement of the
"fractional occupancy" of the sensor antibody after its
reaction with analyte (see Figure 4). Techniques relying on the
measurement of unoccupied antibody binding sites (from which
antibody occupancy is implicitly deduced by subtraction)
necessitate, for attainment of maximal sensitivity, the use of
sensor antibody concentrations tending to zero; these assays
may therefore be categorized as "competitive." Conversely,
techniques in which occupied sites are directly measured permit
(in principle) the use of relatively high concentrations of
sensor antibody and may be described as "noncompetitive." This
difference in assay design simply reflects the proposition
that, to minimize error in the measurement, it is generally
undesirable to measure a small quantity by estimating the
difference between two large quantities.

Fig. 4. The antibody binding occupancy principle of immunoassay.

This concept is illustrated in Figure 5, which portrays basic
immunoassay formats currently in common use. Conventional RIA
and other similar "labeled-analyte" techniques rely on
measurement of unoccupied binding sites, generally by
back-titration (either simultaneous or sequential) with labeled
analyte, but anti-idiotypic antibody (reactive only with
unoccupied sites on the sensor antibody) may be used for the
same purpose. In the case of single-site labeled-antibody
assays, the labeled antibody itself constitutes the sensor
antibody; after reaction with analyte, this sensor antibody may
be separated into occupied and unoccupied fractions through use
of (e.g.) an immunosorbent (comprising antigen, antigen analog,
or anti-idiotypic antibody linked to a solid support). If,
after separation, the signal emitted by labeled antibody bound
to analyte (i.e., the "occupied" fraction) is measured
directly, the assay can be classed as "noncompetitive."
Conversely, if one measures the labeled antibody not bound to
analyte (i.e., that attached to the immunosorbent), then the
assay is "competitive."

Two-site "sandwich" assays are clearly more complex because
they rely on two antibodies and can be considered from two
points of view. For our present purposes, the solid-phase
antibody can be regarded as the sensor antibody, with the
labeled antibody enabling the occupied sensor-antibody binding
sites to be distinguished. Seen from this viewpoint, two-site
assays may be classed as "noncompetitive."

These considerations emphasize that the differences in design
distinguishing so-called competitive and noncompetitive methods
are essentially unrelated to which component (if any) of the
reaction is labeled. Indeed, in the case of transducer-based
"immunosensors" no component is labeled; nevertheless, the
design of such sensors will depend on whether a measurable
signal is yielded by occupied or unoccupied antibody binding
sites situated on the sensor surface. In short, the terms
"competitive" and "noncompetitive" merely reflect alternative
approaches to the determination of the occupancy of antibody
binding sites and lead to differences in the optimal antibody
concentration required to minimize the effects of random errors
arising in the determination.

Fig. 5. Basic competitive and noncompetitive immunoassay designs.

Competitive and noncompetitive immunoassays can be shown to
differ significantly in many of their performance
characteristics, including their sensitivities. In both types
of assays, both the affinity constant of the antibody and the
specific activity of the label are important in determining
sensitivity; however, in practice the sensitivity of
competitive assays is primarily limited by the affinity
constant of the antibody, whereas the specific activity of the
label is more important in noncompetitive assays.
"Experimental" or "manipulation" error in the measurement of
the zero-dose response ($R_0$) [i.e., the relative error
($\sigma_{R_0}/R_0$) arising from pipetting and other
operations, but not including the statistical signal
measurement error per se] is of key importance in determining
"potential" assay sensitivity (i.e., the sensitivity obtained
by assuming the specific activity of the label to be infinite,
implying zero error in signal measurement). Thus the potential
sensitivity of a competitive assay can be shown to be
$\sigma_{R_0}/(K R_0)$, whereas that of a noncompetitive assay
is given by $(R_0/[\mathrm{Ab}])\,\sigma_{R_0}/(K R_0)$, where,
in the latter case, $R_0$ is assumed to represent the labeled
antibody misclassified as bound ($[\mathrm{bAb}]_0$), commonly
referred to as "nonspecifically bound" antibody. Thus
$R_0/[\mathrm{Ab}] = f$, the fraction of labeled antibody that
is nonspecifically bound, and
$(R_0/[\mathrm{Ab}])\,\sigma_{R_0}/(K R_0) = f\,\sigma_{R_0}/(K R_0)$.
Assuming that the relative error ($\sigma_{R_0}/R_0$) in the
measurement of the zero-dose response is approximately
identical for both competitive and noncompetitive assays, it is
evident from this simple analysis that the potential
sensitivity of noncompetitive methods is greater than that of
competitive methods by the factor f, i.e., by the fraction of
labeled antibody that is "nonspecifically bound." For example,
if the nonspecifically bound fraction is 0.01%, a
noncompetitive strategy is potentially capable of a sensitivity
10 000-fold greater than that of a competitive approach, other
factors being equal.
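As a rough illustration of the comparison just made, the sketch below evaluates the two potential-sensitivity expressions as they are read here from the scanned text: relative error divided by K for a competitive design, and the same quantity multiplied by the nonspecifically bound fraction f for a noncompetitive design. The function names and the example values of K and relative error are assumptions, not figures from this passage.

```python
import math

AVOGADRO = 6.022e23  # molecules per mole


def competitive_limit(k, rel_error):
    """Potential sensitivity (mol/L) of a competitive design: (sigma/R0) / K."""
    return rel_error / k


def noncompetitive_limit(k, rel_error, nsb_fraction):
    """Potential sensitivity (mol/L) of a noncompetitive design: f * (sigma/R0) / K."""
    return nsb_fraction * rel_error / k


def to_molecules_per_ml(mol_per_l):
    """Convert mol/L to molecules/mL."""
    return mol_per_l * AVOGADRO / 1000.0


K = 1e12          # assumed affinity constant, L/mol
REL_ERROR = 0.01  # assumed 1% relative error in the zero-dose response
F_NSB = 1e-4      # 0.01% nonspecific binding, as in the text's example

comp = competitive_limit(K, REL_ERROR)
noncomp = noncompetitive_limit(K, REL_ERROR, F_NSB)
print(to_molecules_per_ml(comp), to_molecules_per_ml(noncomp), comp / noncomp)
```

With a nonspecifically bound fraction of 0.01%, the ratio of the two limits is 10 000, matching the factor quoted in the text.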

These findings are summarized in Figure 6 (left), which shows
the relationships between sensitivity (expressed in terms of
molecules per milliliter) and antibody affinity in an optimized
competitive (labeled analyte) assay. For this analysis, we
assume (a) the use of a label of infinite specific activity,
and (b) the use of 125I as a label, the radioactivity of the
samples being counted for 1 min. Computations of the
theoretically optimal reagent concentrations (on which
calculations represented in Figure 6 rely) were based on the
further assumptions that (c) the radioactivity of the
antibody-bound labeled-analyte fraction was counted and (d) the
(relative) "experimental error" component in the measurement of
the bound fraction ($\sigma_b/b$) was 1%. Given these
assumptions, the "potential" sensitivity attainable in such an
assay is $(\sigma_b/b)/K$, where K is the affinity constant of
the antibody. [For example, if the affinity constant is 10^12
L/mol, and $\sigma_b/b$ is 0.01 (1%), maximal assay sensitivity
is 10^-14 mol/L, or ~6 × 10^6 molecules/mL.] The additional
"signal measurement error" arising in consequence of counting
radioactive samples for a finite time implies a loss of assay
sensitivity, as shown by the upper curve in Figure 6 (left).
However, the resulting loss in sensitivity is relatively small
for antibodies of affinities <10^[illegible] L/mol, and is
negligible for antibodies with affinities <10^[illegible]
L/mol. In other words, if the assayist can accept individual
sample counting times of 1-5 min, little improvement in
sensitivity is gained by using alternative labels of higher
specific activities than 125I. However, similar considerations
suggest that radioisotopic labels of much lower specific
activity than 125I (e.g., 3H) may limit the sensitivities of
the assays (such as steroid assays) in which they are used,
notwithstanding the use of relatively long sample counting
times.

Fig. 6. Theoretically predicted sensitivities of competitive and
noncompetitive immunoassay methods (represented by the SD of
zero analyte measurements, expressed as molecules/mL) plotted
as a function of antibody affinity (K).
Note: in noncompetitive sandwich assays, the antibody affinity
referred to is that of the labeled antibody. In the competitive
assay [...] the experimental error in the measurement of the
assay response (i.e., fraction of labeled antigen bound) is 1%.
The "potential sensitivity" curve assumes the use of a label of
infinite specific activity [...]. For noncompetitive assays,
the curves shown relate to values of nonspecific binding of
labeled antibody of 1% (upper curves) and 0.01% (lower curve),
and emphasize the improvement in sensitivity potentially
attainable by minimizing nonspecific binding [...].
The other main conclusions stemming from such analysis are the
importance of both minimizing "manipulation" errors and using
antibodies of high binding affinity. For example, an increase
in $\sigma_b/b$ to 3% implies an approximate threefold loss in
sensitivity, notwithstanding the fact that an assay reoptimized
in response to the deterioration in operator skill that these
numbers imply would utilize less antibody and labeled analyte,
thereby partially offsetting the consequences of poor
pipetting. But the most important conclusion emerging from the
analysis is the near impossibility, in practice, of achieving
immunoassay sensitivities better than about 10^7 molecules/mL
by using a competitive approach, irrespective of the nature of
the label used, if one assumes an upper limit to antibody
binding affinities on the order of 10^12 L/mol.
The results of a similar analysis of the sensitivity
limitations applying to noncompetitive (two-site) assays (15)
are illustrated in Figure 6 (right). Two sets of curves are
portrayed here, corresponding to the assumptions of 1% and
0.01% nonspecific binding of labeled antibody to the
capture-antibody substrate. Such analysis likewise yields
important conclusions relevant to assay design, e.g., the
crucial importance of reducing nonspecific binding of labeled
antibody to an absolute minimum. Furthermore, if nonspecific
binding is re[...] by using an antibody of K [...] L/mol in an
optimized noncompetitive assay design as by using an antibody
of K [...] L/mol in a competitive method. One of the most
important conclusions is that the sensitivities potentially
[...] antibodies [...] of radioisotopically based [...] more.
In short, although, under certain circumstances, noncompetitive
IRMAs may be somewhat more sensitive than corresponding RIA
techniques (assuming the use of the same antibody in each
methodology), the potential advantages (vis-a-vis sensitivity)
of the noncompetitive [...] labels of much higher specific
activity than 125I [...] are combined with high-affinity
antibodies; however, Figure 6 demonstrates that, even with use
of antibodies with affinities of about [...]
These theoretical conclusions, together with [...] production
of monoclonal antibodies (1), constituted the basis of my
laboratory's collaborative [...] immunoassay methodology now
[...], the first "ultra-sensitive" nonisotopic immunoassay
methodology to be developed. The same basic approach has
subsequently been adopted by many other manufacturers, using a
variety of high-specific-activity labels (Table 1).

Against this background, let us now turn to the development of
highly sensitive, miniaturized "microspot" immunoassays and
multianalyte assay systems.



Microspot Immunoassay: Basic Concept

Ambient Analyte Immunoassay 

Particular attention has been drawn above to the [...] that an
antibody concentration approximating 0.5/K is required to
maximize the sensitivity of conventional labeled-antigen
assays. This [...] implicitly overturned by the development of
microspot immunoassays, which we expect to provide [...] a new
generation of binding assay methods. But before discussing this
methodology in detail, another basic analytical concept must be
examined.
The recognition that all immunoassays [...] concentration is
the [...]

$$F^2 - F\left(\frac{1}{[\mathrm{Ab}]} + \frac{[\mathrm{An}]}{[\mathrm{Ab}]} + 1\right) + \frac{[\mathrm{An}]}{[\mathrm{Ab}]} = 0 \qquad (4)$$

where F = fractional occupancy of antibody binding sites, [An]
= analyte concentration, and [Ab] = antibody concentration
(both in units of 1/K) [...]. When the concentration of binding
sites of the antibodies is <0.01/K, analyte depletion in the
medium is invariably <1%, and the system is therefore
effectively independent of sample volume. [...] sites within
the microspot is <(v/K) × 10^-5 × N, where v = volume to which
the microspot is exposed (in milliliters) and N = Avogadro's
number (6 × 10^23). [...] [The fact that the fractional
occupancy of antibody] binding sites is solely dependent on the
ambient concentration of analyte leads to the concept of a
dual-label, "ratiometric," microspot immunoassay.

Table 1. Detection Limits According to Type of Label
Label types legible in the source: enzyme label;
chemiluminescent label; fluorescent label.
Detection-limit entries legible in the source: 7.5 ×
10^[illegible] labeled molecules; determined by enzyme
"amplification" [...]; [...] event per labeled [...]; many
detectable events per labeled [...].
Note: [...] interpretation of binding assay data. [...]
identical for all antibodies if [...] concentration is
expressed in the same manner [...].

Fig. 7. Fractional antibody binding-site occupancy (F; see
equation 4) plotted as a function of antibody binding-site
concentration for different values of analyte concentration
[...] percentage binding (b) of analyte [...].
All concentrations are expressed in units of 1/K. [...] At
antibody concentrations <0.01/K (approximately), the [...] is
<1% for all analyte concentrations [...] in accordance with the
precepts of Yalow and Berson [...].

Dual-Label Microspot Immunoassay 

[...] (Figure 8, left), the probe may be removed and exposed to
a [...] (i.e., the occupied site) on the analyte [...]
chemiluminescent markers (or even [...] over a surface (each
microspot directed against a different analyte), so that
multiple analyte assays may be performed simultaneously on the
same sample. Several advantages stem from adopting a dual
fluorescence measurement. For example, neither the amount nor
the distribution of the sensor antibody within the detector's
field of view is important, because the ratio of the emitted
fluorescent signals is unaffected. Likewise, fluctuations in
the intensity of the incident (exciting) light beam are apt to
be of little significance. These advantages are additional to
the basic benefit stemming from this approach, i.e., that the
necessity of ensuring constancy of the amount of sensor
antibody used in the assay system is removed.

Fig. 8. Microspot immunoassay (panels: noncompetitive assay,
competitive assay): (left) first incubation, in which [...] the
fractional occupancy of antibody binding [...] concentration to
which the microspot has been exposed; (right) second,
"developing" incubation [...] reactive with either occupied
sites [...] or unoccupied sites (competitive) [...].

Fig. 8. Basic principle of dual-label, ambient analyte
immunoassay relying on fluorescent-labeled antibodies.
The ratio of α and β fluorescent photons emitted reflects the
value of F (see Fig. 7) and depends solely on the analyte
concentration to which the [...] been exposed. The ratio is
unaffected by the amount of [...] coated (as a monomolecular
layer) on [...].
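The invariance claimed above can be sketched with a toy model. This is an illustrative assumption rather than the article's own formulation: the sensor antibody carries one fluorophore, the developing antibody carries a second, both are excited by the same beam, and each signal scales linearly with the number of labeled molecules in view. All names and parameter values are hypothetical.

```python
import math


def dual_label_ratio(n_sites, occupancy, excitation,
                     yield_sensor=1.0, yield_developer=1.0):
    """Ratio of developing-antibody signal to sensor-antibody signal.

    n_sites: sensor-antibody binding sites in the field of view
    occupancy: fraction of those sites occupied by analyte (0..1)
    excitation: relative intensity of the exciting light beam
    yield_*: hypothetical signal per labeled molecule per unit excitation
    """
    sensor_signal = excitation * yield_sensor * n_sites
    developer_signal = excitation * yield_developer * n_sites * occupancy
    return developer_signal / sensor_signal


# Same occupancy, very different amounts of antibody and beam intensity.
r1 = dual_label_ratio(n_sites=6e7, occupancy=0.0098, excitation=1.0)
r2 = dual_label_ratio(n_sites=2e5, occupancy=0.0098, excitation=0.4)
print(r1, r2)
```

In this model both the amount of sensor antibody and the excitation intensity cancel out of the ratio, so only the occupancy (and hence the ambient analyte concentration) and the relative signal yields of the two labels remain.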

Microspot Immunoassay Sensitivity 

Because the microspot immunoassay methodology challenges
concepts that have dominated immunoassay design theory in the
past two to three decades, consideration of the potential
sensitivity attainable by this approach is obviously of primary
importance. The proposition that microspot assays may be at
least as sensitive as conventional systems that rely on far
larger amounts of antibody may readily be demonstrated by
consideration of a model system. Let us postulate that sensor
antibody molecules are attached to the surface of a solid
support such that their binding sites remain exposed to the
analyte, and that their affinity for the analyte is thereby
unchanged. (The antibody concentration in the system, i.e., the
number of binding sites on the support divided by the
incubation volume, is unaffected by such attachment, and
antibody occupancy by analyte may be assumed to be identical to
that occurring if the antibody is distributed uniformly
throughout the incubation mixture.) Let us also suppose that
the antibody molecules exist as a uniform monolayer of maximal
surface density on the support and (to simplify discussion) are
unlabeled. Then a change in the concentration of sensor
antibody implies a corresponding change in the surface area
over which the antibody is distributed. If, for example, the
antibody affinity constant is 10^11 L/mol, the total incubation
volume is 1 mL, and the antibody surface density is 6000
binding sites/µm², then an area of 1 mm² (10^6 µm²)
accommodates antibody binding sites corresponding to a
concentration of 1/K; an area of 0.01 mm² corresponds to a
concentration of 0.01/K, etc. Let us further postulate that,
after exposure of the sensor antibodies to a medium containing
analyte at a concentration of 0.01/K (i.e., 6 × 10^7
molecules/mL), we measure "noncompetitively" the resulting
antibody occupancy (e.g., by exposure to a second, labeled,
"developing" antibody directed against the analyte, forming a
typical antibody sandwich). Finally, let us suppose that all
occupied sites react with the [...]
We may now consider the effects of a progressive reduction in
the coated surface area from (e.g.) 1 mm² (antibody
concentration 1/K) through 0.1 mm² (0.1/K) to 0.01 mm² (0.01/K)
and below. From equation 4, the value of F for the 1 mm² area
is 4.98 × 10^-3; thus the number of analyte and labeled
antibody molecules specifically bound to the area is 2.99 ×
10^7 (i.e., about 50% of the total analyte molecules present),
whereas the number of labeled antibody molecules
nonspecifically bound is 10^6. Thus, assuming the field of view
of the detecting instrument is restricted to the area on which
the sensor antibody is deposited (see Figure 10a), and
(provisionally) assuming the background of the instrument
itself to be zero (i.e., the only source of background is the
nonspecifically bound labeled antibody within the instrument's
field of view), the signal/noise ratio observed for the 1 mm²
area is ~30. Similarly, the value of F for a 0.1 mm² area is
9.02 × 10^-3, the number of labeled antibody molecules
specifically bound to the area is 5.41 × 10^6, the number
nonspecifically bound is 10^5, and the signal/noise ratio is
~54. Likewise, the signal/noise ratio for a 0.01 mm² area can
be shown to be ~59. In short, the signal/noise ratio increases
as the antibody-coated surface area is decreased, approaching a
maximal (plateau) value of 60 as the area coated with sensor
antibody falls below 0.01 mm² and tends toward zero.
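The worked figures in this paragraph can be reproduced with a short sketch that solves equation 4 for the fractional occupancy F and divides specifically bound label by nonspecifically bound label. The constants below are taken or inferred from the model system in the text (6000 sites/µm², 6 × 10^9 sites corresponding to 1/K in 1 mL); the nonspecific density of one labeled antibody per µm² is an inference from the quoted 10^6 molecules on 1 mm², and the function names are hypothetical.

```python
import math

SITES_PER_UM2 = 6000.0     # sensor-antibody surface density (model system)
NSB_PER_UM2 = 1.0          # inferred: 10^6 nonspecific labels on 1 mm^2
SITES_AT_1_OVER_K = 6e9    # sites giving a concentration of 1/K in 1 mL


def occupancy(ab, an):
    """Smaller root of equation 4; ab and an are in units of 1/K."""
    b = 1.0 / ab + an / ab + 1.0
    c = an / ab
    return 2.0 * c / (b + math.sqrt(b * b - 4.0 * c))


def signal_to_noise(area_mm2, an=0.01):
    """Specific/nonspecific label ratio when the detector views only the spot."""
    area_um2 = area_mm2 * 1e6
    sites = SITES_PER_UM2 * area_um2
    ab = sites / SITES_AT_1_OVER_K           # antibody concentration in 1/K units
    specific = occupancy(ab, an) * sites     # occupied sites, all assumed labeled
    nonspecific = NSB_PER_UM2 * area_um2
    return specific / nonspecific


for area in (1.0, 0.1, 0.01, 0.001):
    print(area, round(signal_to_noise(area), 1))
```

The ratio rises from about 30 toward a plateau just below 60 as the coated area shrinks, because occupancy approaches its ambient-analyte limit while nonspecific binding falls in proportion to area.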

If, however, a reduction in the antibody-coated area were not
accompanied by a corresponding reduction in the detecting
instrument's field of view, the resulting reduction in "signal"
would not lead to a corresponding decrease in the background
generated by nonspecifically bound developing antibody (Figure
10b). Therefore, although reduction in the coated area would
increase the fractional occupancy of the sensor antibody, the
signal/noise ratio might either remain constant or fall. In
these circumstances it might be advantageous to increase the
coated area. Similarly, if the surface density of sensor
antibody were decreased (the coated area being held constant),
similar conclusions would be reached (Figure 10c).
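A minimal sketch of the fixed-field-of-view case follows. It reuses the model-system constants from the text but adds one assumption that the text does not state: that nonspecifically bound developing antibody is spread uniformly, at one molecule per µm², over the whole field of view, coated or not. Function and parameter names are hypothetical.

```python
import math


def occupancy(ab, an):
    """Smaller root of equation 4; ab and an are in units of 1/K."""
    b = 1.0 / ab + an / ab + 1.0
    c = an / ab
    return 2.0 * c / (b + math.sqrt(b * b - 4.0 * c))


def snr_fixed_view(spot_mm2, view_mm2=1.0, an=0.01,
                   sites_per_um2=6000.0, nsb_per_um2=1.0,
                   sites_at_1_over_k=6e9):
    """Signal/noise when the detector's field of view stays at view_mm2."""
    spot_um2 = spot_mm2 * 1e6
    sites = sites_per_um2 * spot_um2
    signal = occupancy(sites / sites_at_1_over_k, an) * sites
    # Assumed: background comes from the entire field of view.
    noise = nsb_per_um2 * view_mm2 * 1e6
    return signal / noise


for spot in (1.0, 0.1, 0.01):
    print(spot, round(snr_fixed_view(spot), 2))
```

Under this assumption the ratio falls roughly in proportion to the spot area, even though fractional occupancy rises, which is why enlarging the coated area could be advantageous in such circumstances.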

Likewise, if the background signal generated within the
detecting instrument itself (e.g., from the photocathode of a
photomultiplier tube used to detect photons emitted from the
antibody-coated area) were not zero and remained constant
regardless of the instrument's field of view, then a maximum
signal/noise ratio would also be attained at some optimal value
of the antibody-coated area, below which the ratio would fall.
Because, however, one can generally reduce the size of the
detector (and hence the detector-generated background) at the
same rate as the size of the signal-emitting area, there is no
reason, in principle, for the signal/noise ratio to diminish as
the antibody-coated area is progressively reduced toward zero.
Thus if we accept the signal/noise ratio as indicative of the
precision of the measurement of antibody occupancy (and hence
of assay sensitivity), these considerations suggest that it is
advantageous to reduce the antibody-coated surface area (and,
concomitantly, the sensor-antibody concentration) toward zero,
although little advantage is likely to accrue from reducing the
area below 0.01 mm² (and thus the antibody concentration below
0.01/K).

Were the microspot area indeed reduced to zero, both signal and
noise would likewise also fall to zero (the ratio between them
nevertheless remaining essentially constant), implying that no
signal of any kind would, in the limit, be recorded. In
practice, other statistical factors come into play when the
number of individual events (e.g., photons) observed by a
detecting instrument is very low, thus prohibiting a reduction
of the sensor antibody concentration to zero. The point at
which the reduction in the antibody-coated area causes the
signal to be lost sufficiently to affect the precision of the
measurement of antibody occupancy depends clearly on the
specific activity of the labeled antibody used to measure the
occupied binding sites: the higher the specific activity, the
smaller the permissible area. Thus, given labels of very high
specific activity, one can envision circumstances in which,
even in a "noncompetitive" system, the optimal concentration of
sensor antibody may be exceedingly low. A more general
conclusion is that a variety of factors, including the
characteristics of the instruments used for measuring the
labeled antibody (or labeled analyte), influence immunoassay
design, implying, among other things, the virtual impossibility
of formulating general rules regarding this. For example,
reagent concentrations that are optimal for isotopically
labeled reagents used with a conventional radioisotope counter
(possessing a fixed background dependent on its basic
construction) are likely to be entirely different when very
high-specific-activity labels are used and one has the freedom
to tailor the measuring instrument to samples of any size. In
short, certain conclusions based on experience of RIA and IRMA
techniques may prove misleading when applied to nonisotopic
methodologies, and should be viewed with caution.

A more detailed theoretical consideration of (noncompetitive)
microspot immunoassay sensitivity (21) suggests that

$$C_{\min} = D^{*}_{\min} \times \frac{(6 \times 10^{20})\,(1 + [\mathrm{Ab}^{*}])}{D\,K\,[\mathrm{Ab}^{*}]} \qquad (5)$$

where D = surface density (binding sites/µm²) of sensor
antibody, K = sensor antibody affinity (L/mol), [Ab*] =
concentration of labeled antibody in developing solution
(expressed in units of 1/K*, where K* = labeled antibody
affinity), D*min = minimum detectable surface density of
labeled antibody (molecules/µm²), and Cmin = assay detection
limit (molecules/mL). For example, if [Ab*] = 1/K*, D = 10^5
molecules/µm², K = 10^11 L/mol, and D*min = 20 molecules/µm²,
then Cmin = 2.4 × 10^6 molecules/mL (i.e., 4 × 10^-15 mol/L),
and the fractional occupancy of the binding sites of the sensor
antibody by the minimum detectable concentration of analyte is
0.04%. Figure 11 shows the theoretical assay sensitivities
attainable with [...] of various affinities, plotted as a
[...].
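The sketch below evaluates equation 5 in the form it appears to take once the scanned text is read against its own worked example: detection limit = D*min × 6 × 10^20 × (1 + [Ab*]) / (D × K × [Ab*]). That reading, the function name, and the parameter values ([Ab*] = 1 in units of 1/K*, D = 10^5 sites/µm²) are assumptions chosen because they reproduce the quoted result of 2.4 × 10^6 molecules/mL and 0.04% occupancy.

```python
import math

MOLECULES_PER_ML_PER_MOLAR = 6e20  # 6 x 10^23 per mole / 1000 mL per litre


def detection_limit(d_min_label, sensor_density, k_sensor, ab_label):
    """Assay detection limit in molecules/mL (equation 5 as read here).

    d_min_label: minimum detectable labeled-antibody density, molecules/um^2
    sensor_density: sensor-antibody binding sites per um^2
    k_sensor: sensor antibody affinity, L/mol
    ab_label: labeled-antibody concentration in units of 1/K*
    """
    return (d_min_label * MOLECULES_PER_ML_PER_MOLAR * (1.0 + ab_label)
            / (sensor_density * k_sensor * ab_label))


c_min = detection_limit(d_min_label=20.0, sensor_density=1e5,
                        k_sensor=1e11, ab_label=1.0)
molar = c_min / MOLECULES_PER_ML_PER_MOLAR   # mol/L
sensor_occupancy = 1e11 * molar              # K * C, valid at low occupancy
print(c_min, molar, sensor_occupancy)
```

The expression follows from requiring that the labeled-antibody density on the spot, D × K × C × [Ab*]/(1 + [Ab*]), equals the smallest density the instrument can detect.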

A similar theoretical analysis of competitive microspot
immunoassay indicates that potential sensitivities are
essentially identical to those attainable with conventional
competitive methodologies. In summary, the above considerations
indicate that the attainment of high microspot assay
sensitivity requires close packing of molecules of sensor
antibodies within the microspot area, combined with the use of
an instrument capable of accurately measuring very low surface
densities of developing antibodies. They also suggest that (a)
microspot assay sensitivities considerably higher than those
obtainable by conventional isotopically based immunoassays are
achievable, and (b) if labels of very high specific activity
are available, the sensitivities yielded [...]

[...] to diminish with future improvement in photomultiplier
design. Other sources of background include fluorescence
emitted by components in the optical system, which may not, in
current instruments, have been constructed with background
reduction as a prime consideration. Nevertheless, they detect
fluorescent signals with high sensitivity. For example, one
commercially available microscope is claimed to detect
fluorescein at a density of 10 molecules/µm². Most commercially
available fluorescein isothiocyanate (FITC)-labeled IgG
exhibits a fluorophor/protein ratio of ~4; this implies a
detection limit (D*min) for antibody surface density of two or
three FITC-labeled IgG molecules per µm². This, in turn,
implies a theoretical sensitivity for a two-site immunoassay of
~2-3 × 10^5 analyte molecules per milliliter, assuming
identical parameter values as above, or 2-3 × 10^4 molecules/mL
if the sensing antibody has an affinity of 10^12 L/mol.
Clearly, sensitivity may be increased by loading more
fluorophor either directly or indirectly onto the antibody.

Our preliminary studies have relied on a less sensitive
microscope, albeit one possessing facilities for
dual-fluorescence measurement. Its argon laser emits two
excitation lines at 488 and 514 nm. It is thus particularly
efficient in exciting blue/green-emitting fluorophores such as
FITC (excitation maximum 492 nm), but is less efficient in
exciting fluorophores such as Texas Red (excitation maximum 596
nm). However, the ratiometric assay principle permits
considerable variation in detection efficiencies of the two
labels because the specific activities of the labeled antibody
species forming the antibody couplets can be chosen to yield
signal ratios approximating unity. Inefficiency of the argon
laser in exciting Texas Red is thus not a major handicap in
this context. Though this instrument relies on a conventional
microscope and not on an optical system designed for this
purpose (and is thus implicitly less sensitive), it permits
quantification of fluorescence signals generated from
microspots of any selected area. Initial studies have revealed
that, under conditions that are not optimal, the instrument is
capable of detecting ~25 FITC-labeled and (or) 150 Texas
Red-labeled IgG molecules per µm², while scanning an area of
~50 µm².

The development of microspot immunoassays has also necessitated
closer scrutiny of the mechanisms involved in the coupling of
antibodies to solid supports. In the present context, these
should display a capacity to adsorb (in the form of a
monolayer), or to covalently link, a high surface density of
antibody combined with low intrinsic-signal-generating
properties (e.g., low intrinsic fluorescence), thus minimizing
background. We have examined a number of candidate materials,
such as polypropylene, Teflon, cellulose and nitrocellulose
membranes, microtiter plates (clear polystyrene plates; black,
white, and clear polystyrene plates), glass slides and quartz
optical fibers coated with 3-(aminopropyl)triethoxysilane,
etc., and several alternative protocols for achieving high
monolayer coating densities. These studies have exposed
phenomena neither evident nor of importance when antibody
binding to solid supports is examined at a macroscopic level.
Provisionally, we have used white Dynatech Microfluor
microtiter plates, formulated for the detection of low
fluorescence signals and yielding high signal/noise ratios and
high coating densities of functional antibodies (~5 × 10^4 IgG
molecules/µm²), for assay development, although such plates are
not ideal. Indeed, deficiencies in the antibody-deposition
methods used constitute the principal source of imprecision in
assay results and the limitation in sensitivity that this
implies. Clearly, this represents an area for further study and
refinement of current coating techniques.

Notwithstanding the limitations of present instrumentation
(which, among other things, does not permit the use of
time-resolving techniques to distinguish two individual
fluorescence signals either from each other or from background
fluorescence) and the crudeness of present methods for coupling
antibodies onto small areas, we have verified the theoretical
concepts outlined above by comparing the performance of several
assays when constructed in microspot format and when
conventionally designed. Although unoptimized, ratiometric
microspot assays have yielded sensitivity values closely
approaching those of conventional optimized IRMA. As an
example, the results of a ratiometric assay system for
thyrotropin, with use of Texas Red- and FITC-labeled
antibodies, are shown in Figure 13. Bearing in mind the
well-known limitations of these and other "conventional"
fluorophors when used as immunoassay reagent labels, such
results are encouraging, although further work is clearly
required to achieve the considerably greater sensitivity
theoretically predicted with use of improved fluorophors,
better antibody-microspotting techniques, and purpose-built
(time-resolving) instrumentation.

The finding that highly sensitive immunoassays can 
be performed with far smaller amounts of antibody than 

Fig. 13. Response curve in a dual-labeled microspot ratiometric
assay of thyrotropin (TSH) with Texas Red-labeled solid-phase
capture antibody and a developing antibody labeled with biotin/
FITC-avidin

The FITC/Texas Red ratio for each microspot was measured with a scanning
confocal microscope, and plotted as a function of TSH concentration in
milli-int. units/L






are currently used conventionally permits in turn the
construction of antibody microspot arrays enabling, in
principle, the simultaneous measurement of thousands
of different substances in 1-mL samples. In collaboration
with investigators at the Centre for Applied Microbiological
Research, Porton Down, U.K., we are presently
developing various techniques for the creation of
such arrays. Indeed, similar technologies have recently
been used for the parallel synthesis of several different
polypeptides, these enabling 10 000-microspot arrays to
be constructed on silica chips approximating 1 cm² (24).
Although arrays of this capacity are unlikely to ever be
required for conventional diagnostic purposes, we can
anticipate that the ability to simultaneously measure
many substances in the same sample will have revolutionary
consequences in medicine and other similar
areas. In addition, such techniques may ultimately
permit the individual analysis of the multiple isoforms
of certain "heterogeneous" analytes (e.g., the glycoprotein
hormones), such molecular heterogeneity currently
presenting a major obstacle to the standardization and
interpretation of many immunological measurements
(25). Moreover, although these concepts have been illustrated
in an immunoassay context, they are clearly
applicable to all "binding assays," including those relying
on the use of DNA probes, hormone receptors, etc.
For example, labeled lectins that are specific in their
reactions with the sugar residues in the oligosaccharide
chains of glycoprotein molecules may be used, together
with specific antibodies, to impart additional "structural
specificity" to sandwich assays (26, 27), possibly overcoming
the limitations of antibodies per se in regard to
differentiation of the glycosylation variants of the glycoprotein
hormones.

Summary and Conclusion

Because of past confusion regarding the concepts of 
precision, sensitivity, accuracy, etc., several erroneous 
concepts have become incorporated within currently 
accepted rules of immunoassay design. In particular, 
much higher antibody concentrations are customarily 
used than are necessary to achieve very high assay 
sensitivity, provided that certain measurement strate- 
gies are adhered to. In this presentation, we have 
attempted to show that, in principle, the highest assay 
sensitivities are obtained by confining a small number 
of sensor antibody molecules onto a very small area in 
the form of a microspot and measuring their occupancy 
by an analyte, by using very high-specific-activity "de- 
veloping" antibody probes, thereby maximizing the sig- 
nal/noise ratio in the determination of sensor antibody 
occupancy. This observation, which contradicts cur- 
rently accepted immunoassay design theory, in turn 
makes possible the measurement of an unlimited num- 
ber of different analytes on a chip of very small surface 
area through the use of, e.g., laser scanning techniques 
closely analogous to those used in compact disk tech- 
niques of sound recording. Extensive experimental stud- 
ies in this area, albeit conducted with relatively crude 
techniques and instrumentation not specifically de- 



signed for these purposes, and therefore not reported in
detail here, have demonstrated the feasibility of the
miniaturized antibody microspot approach and the validity
of the general concepts on which it is based. We
are therefore confident that this represents the basis of
a next-generation technology that is likely to have a
revolutionary impact on all fields involving the use of
binding assays.
Corrections



Vol 37, pp. 1447-8: In our desire for rapid publication, 
important errors were introduced into the following 
Technical Brief. The corrected version is here repro- 
duced in its entirety, with our apologies to the authors. 

Rapid Detection of 1717-1 G→A Mutation in CFTR Gene
by PCR-Mediated Site-Directed Mutagenesis, Laura
Cremonesi,¹ Manuela Seia,² Carmelina Magnani,¹ and
Maurizio Ferrari¹ (¹ Istituto Scientifico H.S. Raffaele,
Lab. Centrale, Milano; ² Istituti Clin. di Perfezionamento,
Lab. di Ricerche Clin., Milano, Italy)

Until now, among the non-ΔF508 mutations identified in
the cystic fibrosis transmembrane conductance regulator
(CFTR) gene by the Cystic Fibrosis (CF) Genetic Analysis
Consortium, the ones most frequently seen in our population
sample are the 1717-1G→A mutation (13/144 or 9% of
the CF chromosomes) and the G542X mutation (16/190 or
8.4% of the CF chromosomes), both revealed by dot-blot
hybridization of the polymerase chain reaction (PCR) product
with allele-specific oligonucleotide (ASO) probes (1).

In an attempt to simplify the analysis of the most
frequent mutations in the CFTR gene, we converted radiolabeled
ASO detection into restriction endonuclease analysis
of the amplified product.

A PCR-mediated site-directed mutagenesis (2, 3) to detect
the G542X mutation by generating a novel BstNI site
in the wild-type sequence had already been suggested (4).

To detect the 1717-1G→A mutation, we designed the
reverse primer (5'-CTCTGCAAACTTGGAGAGGTC-3') to
contain a single-base mismatch (T→G), which would create
a novel AvaII restriction site [G↓G(A/T)CC] in the amplified
wild-type (WT) allele but not in the CF mutant (M)
allele:



WT 1717:

5' ...TAGGACA ... GCAGAG... 3'
3' ...ATCCTGG* ... CGTCTC    5'
      (AvaII site)

M 1717:

5' ...TAAGACA ... GCAGAG... 3'
3' ...ATTCTGG* ... CGTCTC    5'

*, mutagenized base of reverse primer



Fig. 1. Detection of the 1717-1 G→A mutation by PCR
Reactions were carried out with 1 μg of genomic DNA in a total volume of 100
μL containing 10 mmol/L Tris-HCl (pH 6.3), 50 mmol/L KCl, 1.6 mmol/L
MgCl2, 0.1 g/L gelatin, 200 μmol/L each of the four deoxyribonucleotide
triphosphates, 2.5 units of Taq polymerase (Perkin-Elmer Cetus, Norwalk,
CT), and 100 pmol of each of the primers. PCR conditions were as follows:
denaturation at 94 °C for 1 min, annealing at 55 °C for 30 s, and extension at
72 °C for 1 min, for a total of 30 cycles. PCR products were digested for 2 h at
37 °C with 5 U of AvaII and electrophoresed on 3% agarose-1% NuSieve gel
for 1 h at 50 V. Bands were made visible by staining the gel with ethidium
bromide. Lane 1: HaeIII-digested pBR322 size marker. Lane 2: normal
homozygote. Lane 3: CF patient homozygous for the 1717-1 G→A mutation.
Lane 4: heterozygote carrier for the 1717-1 G→A mutation



For the forward primer, we used the one made available
by the CF Genetic Analysis Consortium to amplify exon 11
of the CFTR gene: 5'-CAACTGTGGTTAAAGCAATAGTGT-3'.

Digestion by AvaII enzyme of the PCR product generates
two fragments of 116 and 21 bp in the wild-type alleles
and leaves undigested a 137-bp fragment in the mutant
alleles (Figure 1).
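
The fragment sizes quoted above can be checked with a short digestion sketch. This is only an illustration under stated assumptions: the amplicon sequences are hypothetical stand-ins (113 placeholder `N` bases followed by the intron/exon junction), and the 21 bases at the 3' end are one possible reading of the damaged primer sequence, not a verified one.

```python
import re

# AvaII recognizes G^G(A/T)CC and cuts after the first G of the site.
AVAII = re.compile(r"GG[AT]CC")


def avaii_fragments(seq):
    """Return fragment lengths after cutting at every AvaII site in seq."""
    cuts = [m.start() + 1 for m in AVAII.finditer(seq)]
    bounds = [0] + cuts + [len(seq)]
    return [b - a for a, b in zip(bounds, bounds[1:])]


# Hypothetical 137-bp amplicons: placeholder bases, then the junction.
# The exon-side 21-mer is an assumed reading of the mutagenized region.
PLACEHOLDER = "N" * 113
EXON_SIDE = "GACCTCTCCAAGTTTGCAGAG"
wild_type = PLACEHOLDER + "TAG" + EXON_SIDE  # splice acceptor AG intact
mutant = PLACEHOLDER + "TAA" + EXON_SIDE     # 1717-1 G>A destroys the site

print(avaii_fragments(wild_type))
print(avaii_fragments(mutant))
```

With these assumed sequences the wild-type product is cut once and the mutant product is left intact, which is the pattern the gel lanes are meant to distinguish.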

By combined analysis for the ΔF508 mutation (5) (252/
470 or 53.6% of the CF chromosomes), 1717-1G→A, and
G542X, about 71% of mutations might be detected by
nonisotopic analysis of the PCR product, thus allowing a
faster and easier one-day procedure for carrier screening
and prenatal testing.
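
The "about 71%" figure follows from adding the three reported frequencies. A minimal arithmetic check, assuming the three mutations can simply be summed even though they were scored on different numbers of chromosomes (the dictionary keys are labels chosen here for illustration):

```python
# Counts reported in the text: (mutant chromosomes, chromosomes screened).
reported = {
    "deltaF508": (252, 470),
    "1717-1G>A": (13, 144),
    "G542X": (16, 190),
}

# Per-mutation frequency in percent, then the combined detection rate.
freqs = {name: 100 * hits / total for name, (hits, total) in reported.items()}
combined = sum(freqs.values())

for name, pct in freqs.items():
    print(f"{name}: {pct:.1f}%")
print(f"combined: {combined:.1f}%")
```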
DECLARATION OF VISHWANATH R. IYER, Ph.D.
UNDER 37 C.F.R. § 1.132

I, VISHWANATH R. IYER, Ph.D., declare and state as
follows:

1. I am an Assistant Professor in the Section of 
Molecular Genetics and Microbiology, Institute of Cellular and 
Molecular Biology, University of Texas at Austin, where my 
laboratory currently studies global transcriptional control in 
yeast, gene expression programs during human cell 
proliferation, and genome-wide transcription factor targets in 
yeast and human. Immediately prior to this position, I spent 
four years as a postdoctoral fellow in the laboratory of 
Patrick O. Brown at Stanford University studying the
transcriptional programs of yeast and of human cells. My 
curriculum vitae is attached hereto as Exhibit A. 

2. Beginning in Dr. Brown's laboratory, where I
helped to develop the first whole genome arrays for yeast and 
early versions of highly representative cDNA arrays for human 
cells, and continuing to the present day, I have used 
microarray-based gene expression analysis as a principal 
approach in much of my research. 

3. Representative publications describing this 
work include: 



DeRisi J. et al., "Exploring the metabolic and
genetic control of gene expression on a genomic
scale," Science 278:680-686 (1997); 1

Marton et al., "Drug target validation and
identification of secondary drug target effects
using DNA microarrays," Nature Med. 4:1293-1301
(1998); 2

Iyer et al., "The transcriptional program in
the response of human fibroblasts to serum,"
Science 283:83-87 (1999); 3 and

Ross et al., "Systematic variation in gene
expression patterns in human cancer cell lines,"
Nature Genetics 24:227-235 (2000). 4

Two of the papers describe our use of microarray-based 
expression profiling to explore the metabolic reprogramming 
that occurs during major physiological changes, both in yeast 
(DeRisi et al., during the shift from fermentation to
respiration) and in human cells (Iyer et al., human
fibroblasts exposed to serum). One reference describes our
use of expression profile analysis in drug target validation
and identification of secondary drug effects (Marton et al.).
And one describes our use of expression profiling as a
molecular phenotyping tool to discriminate among human cancer
cells (Ross et al.).

4. Whether used to elucidate basic physiological 
responses, to study primary and secondary drug effects, or to 
discriminate and classify human cancers, expression profiling 



1 Attached hereto as Exhibit B.
2 Attached hereto as Exhibit C.
3 Attached hereto as Exhibit D.
4 Attached hereto as Exhibit E.



as we have practiced it relies for its power on comparison of 
patterns of expression. 



5. For example, we have demonstrated that we can 
use the presence or absence of a characteristic drug
"signature" pattern of altered gene expression in drug-treated
cells to explore the mechanism of drug action, and to identify 
secondary effects that can signal potentially deleterious drug 
side effects. As another example, we have demonstrated that 
gene expression patterns can be used to classify human tumor 
cell lines. While it is of course advantageous to know the 
biological function of the encoded gene products in order to 
reach a better understanding of the cellular mechanisms 
underlying these results, these pattern-based analyses do not 
require knowledge of the biological function of the encoded 
proteins. 

6. The resolution of the patterns used in such 
comparisons is determined by the number of genes detected: the 
greater the number of genes detected, the higher the 
resolution of the pattern. It goes without saying that higher 
resolution patterns are generally more useful in such 
comparisons than lower resolution patterns. With such higher 
resolutions comes a correspondingly higher degree of 
statistical confidence for distinguishing different patterns, 
as well as identifying similar ones. 

7. Each gene included as a probe on a microarray 
provides a signal that is specific to the cognate transcript, 
at least to a first approximation. 5 Each new gene-specific 



5 In a more nuanced view, it is certainly possible for a probe to
signal the presence of a variety of splice variants of a single gene, 

(Continued...) 



probe added to a microarray thus increases the number of genes 
detectable by the device, increasing the resolving power of 
the device. As I note above, higher resolution patterns are 
generally more useful in comparisons than lower resolution 
patterns. Accordingly, each new gene probe added to a 
microarray increases the usefulness of the device in gene 
expression profiling analyses. This proposition is so well- 
established as to be virtually an axiom in the art, and has 
been as long as I have been working in the field, and 
certainly since the time I embarked on the production of whole 
genome arrays in early 1996. Simply put, arrays with fewer 
gene-specific probes are inferior to arrays with more gene- 
specific probes. 

8. For example, our ability to subdivide cancers 
into discriminable classes by expression profiling is limited 
by the resolution of the patterns produced. With more genes 
contributing to the expression patterns, we can potentially 
draw finer distinctions among the patterns, thus subdividing 
otherwise indistinguishable cancers into a greater number of 
classes; the greater the number of classes, the greater the 
likelihood that the cancers classified together will respond 
similarly to therapeutic intervention, permitting better 
individualization of therapy and, we hope, better treatment 
outcomes.

9. If a gene does not change expression in an 
experiment, or if a gene is not expressed and produces no 



(...Continued) 

without discriminating among them, and for a probe to signal the presence 
of a variety of allelic variants of a single gene, again without 
discriminating among them. 



signal in an experiment, that is not to say that the probe
lacks usefulness on the array; it only means that an
insufficient number of conditions have been sampled to
identify expression changes. In fact, an experiment showing
that a gene is not expressed or that its expression level does
not change can be equally informative. To provide maximum
versatility as a research tool, the microarray should
include, and as a biologist I would want my microarray to
include, each newly identified gene as a probe.

10. I declare further that all statements made
herein of my own knowledge are true and that all statements
made on information and belief are believed to be true, and
further that these statements were made with the knowledge
that willful false statements and the like so made are
punishable by fine or imprisonment, or both, under
Section 1001 of Title 18 of the United States Code and may
jeopardize the validity of any patent application in which
this declaration is filed or any patent that issues thereon.
Austin, TX 78712-0159

Phone: 512-232-7833 

Fax: 512-232-3432 

Email: vishy@mail.utexas.edu 

Education/Training 

Bombay University, Mumbai, India B.Sc. (1987), Chemistry & Biochemistry 

M. S. University of Baroda, Baroda, India M.Sc. (1989), Biotechnology 

Harvard University, Cambridge MA Ph.D. (1996), Genetics 

Stanford University, Stanford CA Post-doctoral (1996-2000), Genomics 

Research Experience 

9/00-5/03 Assistant professor, Section of Molecular Genetics and 
Microbiology, University of Texas, Austin TX 

■ Global transcriptional control in yeast

■ Gene expression programs during human cell proliferation

■ Genome-wide transcription factor targets in yeast and human 

■ Collaborative microarray facility 

5/96-8/00 Post-doctoral fellow Stanford University, Stanford CA 
(Advisor: Dr. Patrick O. Brown) 
■ Yeast whole-genome ORF and intergenic microarrays

■ Human cDNA microarrays for expression profiling 

9/89-4/96 Graduate student Harvard University, Cambridge MA 

(Advisor: Dr. Kevin Struhl) 

■ Yeast transcriptional regulation 



Honours and Awards 

Government of India Biotechnology Fellowship (1987-1989) 
University Grants Commission Junior Research Fellowship (1989) 
Stanford University/NHGRI Genome Training Grant (1996) 

Invited Conference talks (selected) 

Invited Lecturer, NEC-Princeton Lectures in Biophysics 
Princeton, NJ (June 1998) 
Plenary Session Speaker, HGM '99 (HUGO Human Genome Meeting)
Brisbane, Australia (April 1999) 
Invited Speaker, Gordon Research Conference "Human Molecular Genetics 
Invited Speaker, Nature Genetics "Oncogenomics 2002" Conference

Dublin, Ireland (May 2002)
Invited Speaker, "Pathology Bioinformatics" Symposium, University of Michigan,
Invited Speaker in Functional Genomics (Gene Networks) Symposium, International

Congress of Genetics, Melbourne Australia July 6-11 2003
Reviewer for Genome Biology, Genome Research, Nature Genetics, Science (1998-
Member, NIDDK Special Emphasis Review Panel ZDK1 (2001-2002)
1. Iyer V. & Struhl, K. (1995) Poly(dA:dT), a ubiquitous promoter element that

stimulates transcription via its intrinsic DNA structure, EMBO J. 14: 2570-2579. 
Exploring the Metabolic and Genetic Control of
Gene Expression on a Genomic Scale

Joseph L. DeRisi, Vishwanath R. Iyer, Patrick O. Brown*

DNA microarrays containing virtually every gene of Saccharomyces cerevisiae were used 
to carry out a comprehensive investigation of the temporal program of gene expression 
accompanying the metabolic shift from fermentation to respiration. The expression 
profiles observed for genes with known metabolic functions pointed to features of the 
metabolic reprogramming that occur during the diauxic shift, and the expression patterns 
of many previously uncharacterized genes provided clues to their possible functions. The 
same DNA microarrays were also used to identify genes whose expression was affected 
by deletion of the transcriptional co-repressor TUP1 or overexpression of the transcrip- 
tional activator YAP1. These results demonstrate the feasibility and utility of this ap- 
proach to genomewide exploration of gene expression patterns. 



The complete sequences of nearly a dozen 
microbial genomes are known, and in the 
next several years we expect to know the 
complete genome sequences of several 
metazoans, including the human genome. 
Defining the role of each gene in these 
genomes will be a formidable task, and un- 
derstanding how the genome functions as a 
whole in the complex natural history of a 
living organism presents an even greater 
challenge. 

Knowing when and where a gene is 
expressed often provides a strong clue as to 
its biological role. Conversely, the pattern 
of genes expressed in a cell can provide 
detailed information about its state. Al- 
though regulation of protein abundance in 
a cell is by no means accomplished solely 
by regulation of mRNA, virtually all dif- 
ferences in cell type or state are correlated 
with changes in the mRNA levels of many 
genes. This is fortuitous because the only 
specific reagent required to measure the 
abundance of the mRNA for a specific 
gene is a cDNA sequence. DNA microar- 
rays, consisting of thousands of individual 
gene sequences printed in a high-density 
array on a glass microscope slide (1, 2), 
provide a practical and economical tool 
for studying gene expression on a very 
large scale (3-6). 

Saccharomyces cerevisiae is an especially 

Department of Biochemistry, Stanford University School
of Medicine, Howard Hughes Medical Institute, Stanford,
CA 94305-5428, USA.

*To whom correspondence should be addressed. E-mail:
pbrown@cmgm.stanford.edu



favorable organism in which to conduct a 
systematic investigation of gene expression. 
The genes are easy to recognize in the ge- 
nome sequence, cis regulatory elements are 
generally compact and close to the tran- 
scription units, much is already known 
about its genetic regulatory mechanisms, 
and a powerful set of tools is available for its 
analysis. 

A recurring cycle in the natural history 
of yeast involves a shift from anaerobic 
(fermentation) to aerobic (respiration) me- 
tabolism. Inoculation of yeast into a medi- 
um rich in sugar is followed by rapid growth 
fueled by fermentation, with the production 
of ethanol. When the fermentable sugar is 
exhausted, the yeast cells turn to ethanol as 
a carbon source for aerobic growth. This 
switch from anaerobic growth to aerobic 
respiration upon depletion of glucose, re- 
ferred to as the diauxic shift, is correlated 
with widespread changes in the expression 
of genes involved in fundamental cellular 
processes such as carbon metabolism, pro- 
tein synthesis, and carbohydrate storage 
(7). We used DNA microarrays to charac- 
terize the changes in gene expression that 
take place during this process for nearly the 
entire genome, and to investigate the ge- 
netic circuitry that regulates and executes 
this program. 

Yeast open reading frames (ORFs) were 
amplified by the polymerase chain reaction 
(PCR), with a commercially available set of
primer pairs (8). DNA microarrays, con- 
taining approximately 6400 distinct DNA 
sequences, were printed onto glass slides by 



using a simple robotic printing device (9). 
Cells from an exponentially growing culture 
of yeast were inoculated into fresh medium 
and grown at 30°C for 21 hours. After an 
initial 9 hours of growth, samples were har- 
vested at seven successive 2-hour intervals, 
and mRNA was isolated (10). Fluorescently
labeled cDNA was prepared by reverse transcription
in the presence of Cy3(green)-
or Cy5(red)-labeled deoxyuridine triphosphate
(dUTP) (11) and then hybridized to
the microarrays (12). To maximize the reliability
with which changes in expression
levels could be discerned, we labeled cDNA
prepared from cells at each successive time 
point with Cy5, then mixed it with a Cy3- 
labeled "reference" cDNA sample prepared 
from cells harvested at the first interval 
after inoculation. In this experimental de- 
sign, the relative fluorescence intensity 
measured for the Cy3 and Cy5 fluors at 
each array element provides a reliable mea- 
sure of the relative abundance of the corre- 
sponding mRNA in the two cell popula- 
tions (Fig. 1). Data from the series of seven 
samples (Fig. 2), consisting of more than 
43,000 expression-ratio measurements, 
were organized into a database to facilitate 
efficient exploration and analysis of the 
results. This database is publicly available 
on the Internet (13).
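
The two-color design described above reduces each array element to a Cy5/Cy3 intensity ratio. A minimal sketch of that calculation follows; the gene names, intensities, and the flooring rule are hypothetical choices for illustration and are not taken from the paper.

```python
import math


def log2_ratio(cy5, cy3, floor=1.0):
    """log2 of the Cy5/Cy3 intensity ratio; intensities are floored
    (an assumed convention) so that a zero reading cannot divide by zero."""
    return math.log2(max(cy5, floor) / max(cy3, floor))


# Hypothetical background-corrected intensities: (Cy5 sample, Cy3 reference).
spots = {
    "geneA": (4000.0, 1000.0),  # fourfold more abundant in the sample
    "geneB": (500.0, 2000.0),   # fourfold less abundant in the sample
    "geneC": (1500.0, 1500.0),  # unchanged
}
ratios = {gene: log2_ratio(cy5, cy3) for gene, (cy5, cy3) in spots.items()}

# Upper bound on the dataset size: seven samples on ~6400-element arrays.
max_measurements = 7 * 6400

print(ratios)
print(max_measurements)
```

Working in log2 units makes induction and repression symmetric around zero, which is why fold-change thresholds are often applied on that scale.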

During exponential growth in glucose- 
rich medium, the global pattern of gene 
expression was remarkably stable. Indeed, 
when gene expression patterns between the 
first two cell samples (harvested at a 2-hour 
interval) were compared, mRNA levels dif- 
fered by a factor of 2 or more for only 19 
genes (0.3%), and the largest of these differences
was only 2.7-fold (14). However, as
glucose was progressively depleted from the 
growth media during the course of the ex- 
periment, a marked change was seen in the 
global pattern of gene expression. mRNA 
levels for approximately 710 genes were 
induced by a factor of at least 2, and the 
mRNA levels for approximately 1030 genes 
declined by a factor of at least 2. Messenger 
RNA levels for 183 genes increased by a 
factor of at least 4, and mRNA levels for 
203 genes diminished by a factor of at least 
4. About half of these differentially ex- 
pressed genes have no currently recognized 
function and are not yet named. Indeed, 
more than 400 of the differentially ex- 
pressed genes have no apparent homology 
to any gene whose function is known (15).
The responses of these previously uncharacterized
genes to the diauxic shift therefore
provide the first small clue to their possible
roles.
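
The gene counts above come from applying fold-change cutoffs to the expression ratios. The following is a hedged sketch of that kind of tally on made-up log2 ratios; the function name and data are hypothetical, and the paper does not describe its own counting procedure in this passage.

```python
import math


def count_changed(log2_ratios, fold):
    """Count genes induced or repressed by a factor of at least `fold`."""
    cutoff = math.log2(fold)
    induced = sum(1 for r in log2_ratios if r >= cutoff)
    repressed = sum(1 for r in log2_ratios if r <= -cutoff)
    return induced, repressed


# Hypothetical log2(Cy5/Cy3) values for seven genes.
example = [2.5, 1.2, 0.3, -1.0, -2.1, 0.0, -0.4]

print(count_changed(example, 2))
print(count_changed(example, 4))

# The 0.3% quoted for the first two samples is 19 genes out of ~6400.
stable_phase_pct = round(100 * 19 / 6400, 1)
print(stable_phase_pct)
```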

The global view of changes in expres- 
sion of genes with known functions pro- 
vides a vivid picture of the way in which 
the cell adapts to a changing environ- 
ment. Figure 3 shows a portion of the yeast 
metabolic pathways involved in carbon 
and energy metabolism. Mapping the 
changes we observed in the mRNAs en- 
coding each enzyme onto this framework 
allowed us to infer the redirection in the 
flow of metabolites through this system. 
We observed large inductions of the genes
coding for the enzymes aldehyde dehydrogenase
(ALD2) and acetyl-coenzyme
A (CoA) synthase (ACS1), which function
together to convert the products of
alcohol dehydrogenase into acetyl-CoA,
which in turn is used to fuel the tricarboxylic
acid (TCA) cycle and the glyoxylate
cycle. The concomitant shutdown of transcription
of the genes encoding pyruvate
decarboxylase and induction of pyruvate
carboxylase rechannels pyruvate away
from acetaldehyde, and instead to oxalacetate,
where it can serve to supply the
TCA cycle and gluconeogenesis. Induction
of the pivotal genes PCK1, encoding
phosphoenolpyruvate carboxykinase, and
FBP1, encoding fructose 1,6-biphosphatase,
switches the directions of two key
irreversible steps in glycolysis, reversing
the flow of metabolites along the reversible
steps of the glycolytic pathway toward
the essential biosynthetic precursor,
glucose-6-phosphate. Induction of the genes
coding for the trehalose synthase and glycogen
synthase complexes promotes channeling
of glucose-6-phosphate into these
carbohydrate storage pathways.

Just as the changes in expression of 
genes encoding pivotal enzymes can pro- 
vide insight into metabolic reprogram- 
ming, the behavior of large groups of func- 
tionally related genes can provide a broad 
view of the systematic way in which the 
yeast cell adapts to a changing environ- 
ment (Fig. 4). Several classes of genes, 
such as cytochrome c-related genes and 
those involved in the TCA/glyoxylate cycle
and carbohydrate storage, were coordinately
induced by glucose exhaustion. In
contrast, genes devoted to protein synthe- 
sis, including ribosomal proteins, tRNA 
synthetases, and translation, elongation, 
and initiation factors, exhibited a coordi- 
nated decrease in expression. More than 
95% of ribosomal genes showed at least 
twofold decreases in expression during the 
diauxic shift (Fig. 4) (13). A noteworthy 
and illuminating exception was that the 



genes encoding mitochondrial ribosomal 
genes were generally induced rather than 
repressed after glucose limitation, highlighting
the requirement for mitochondrial
biogenesis (13). As more is learned about
the functions of every gene in the yeast 
genome, the ability to gain insight into a 
cell's response to a changing environment 
through its global gene expression patterns 
will become increasingly powerful. 

Several distinct temporal patterns of ex- 
pression could be recognized, and sets of 
genes could be grouped on the basis of the 
similarities in their expression patterns. The 
characterized members of each of these 
groups also shared important similarities in 
their functions. Moreover, in most cases, 
common regulatory mechanisms could be 
inferred for sets of genes with similar expres- 
sion profiles. For example, seven genes 
showed a late induction profile, with mRNA 
levels increasing by more than ninefold at 
the last timepoint but less than threefold at
the preceding timepoint (Fig. 5B). All of
these genes were known to be glucose-repressed,
and five of the seven were previously
noted to share a common upstream activating
sequence (UAS), the carbon source response
element (CSRE) (16-20). A search
in the promoter regions of the remaining two
genes, ACR1 and IDP2, revealed that
ACR1, a gene essential for ACS1 activity,
also possessed a consensus CSRE motif, but
interestingly, IDP2 did not. A search of the
entire yeast genome sequence for the consensus
CSRE motif revealed only four additional
candidate genes, none of which
showed a similar induction.

Examples from additional groups of 
genes that shared expression profiles are 
illustrated in Fig. 5, C through F. The 
sequences upstream of the named genes in 
Fig. 5C all contain stress response elements
(STRE), and with the exception
Fig. 1. Yeast genome microarray. The actual size of the microarray is 18 mm by 18 mm. The
microarray was printed as described (9). This image was obtained with the same fluorescent
scanning confocal microscope used to collect all the data we report (49). A fluorescently labeled
cDNA probe was prepared from mRNA isolated from cells harvested shortly after inoculation (culture
density of <5 × 10^6 cells/ml and media glucose level of 19 g/liter) by reverse transcription in the
presence of Cy3-dUTP. Similarly, a second probe was prepared from mRNA isolated from cells taken
from the same culture 9.5 hours later (culture density of ~2 × 10^8 cells/ml, with a glucose level of
<0.2 g/liter) by reverse transcription in the presence of Cy5-dUTP. In this image, hybridization of the
Cy3-dUTP-labeled cDNA (that is, mRNA expression at the initial timepoint) is represented as a green 
signal, and hybridization of Cy5-dUTP-labeled cDNA (that is, mRNA expression at 9.5 hours) is 
represented as a red signal. Thus, genes induced or repressed after the diauxic shift appear in this 
image as red and green spots, respectively. Genes expressed at roughly equal levels before and after 
the diauxic shift appear in this image as yellow spots. 



www.sciencemag.org • SCIENCE • VOL. 278 • 24 OCTOBER 1997 






of HSP42, have previously been shown to 
be controlled at least in part by these 
elements (21-24). Inspection of the sequences
upstream of HSP42 and the two
uncharacterized genes shown in Fig. 5C, 
YKL026c, a hypothetical protein with 
similarity to glutathione peroxidase, and 
YGR043c, a putative transaldolase, re- 
vealed that each of these genes also pos- 
sess repeated upstream copies of the stress- 
responsive CCCCT motif. Of the 13 ad- 
ditional genes in the yeast genome that 
shared this expression profile [including 
HSP30, ALD2, OM45, and 10 uncharac- 
terized ORFs (25)], nine contained one or 
more recognizable STRE sites in their up- 
stream regions. 

The heterotrimeric transcriptional activator
complex HAP2,3,4 has been shown
to be responsible for induction of several
genes important for respiration (26-28).
This complex binds a degenerate consensus
sequence known as the CCAAT box (26).
Computer analysis, using the consensus se- 
quence TNRYTGGB (29), has suggested 
that a large number of genes involved in 
respiration may be specific targets of 
HAP2,3,4 (30). Indeed, a putative 
HAP2,3,4 binding site could be found in
the sequences upstream of each of the seven 
cytochrome c-related genes that showed 
the greatest magnitude of induction (Fig. 
5D). Of 12 additional cytochrome c-related 
genes that were induced, HAP2,3,4 binding 
sites were present in all but one. Signifi- 
cantly, we found that transcription of 
HAP4 itself was induced nearly ninefold 
concomitant with the diauxic shift. 
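
A search for a degenerate consensus such as TNRYTGGB can be sketched by expanding the IUPAC codes into a regular expression and scanning both strands. This is an illustrative sketch only: the test sequences are hypothetical, the helper names are invented here, and the paper's own search software is not described in this passage.

```python
import re

# IUPAC nucleotide codes needed for the consensus quoted in the text.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "N": "[ACGT]", "R": "[AG]", "Y": "[CT]", "B": "[CGT]"}
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Reverse complement of an A/C/G/T sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_sites(consensus, seq):
    """Return (strand, start) for each match on either strand.
    start is 0-based and always refers to the sequence as given."""
    body = "".join(IUPAC[c] for c in consensus)
    pattern = re.compile("(?=(" + body + "))")  # lookahead allows overlaps
    n = len(consensus)
    hits = [("+", m.start()) for m in pattern.finditer(seq)]
    hits += [("-", len(seq) - m.start() - n)
             for m in pattern.finditer(revcomp(seq))]
    return hits


# Hypothetical upstream fragments.
print(find_sites("TNRYTGGB", "GGCTCATTGGCAAA"))
print(find_sites("TNRYTGGB", "AAGCCAATGATT"))
```

The second fragment carries the site on the opposite strand, where it reads as CCAAT on the given strand, which is why both orientations are scanned.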

Control of ribosomal protein biogenesis 
is mainly exerted at the transcriptional 
level, through the presence of a common 
upstream-activating element (UAS)
that is recognized by the Rap1 DNA-binding
protein (31, 32). The expression profiles
of seven ribosomal proteins are shown
in Fig. 5F. A search of the sequences 
upstream of all seven genes revealed con- 
sensus Rapl-binding motifs (33). It has 
been suggested that declining Rapl levels 
in the cell during starvation may be responsible
for the decline in ribosomal protein
gene expression (34). Indeed, we observed
that the abundance of RAP1
mRNA diminished by 4.4-fold, at about 
the time of glucose exhaustion. 

Of the 149 genes that encode known or 
putative transcription factors, only two, 
HAP4 and SIP4, were induced by a factor of 
more than threefold at the diauxic shift. 
SIP4 encodes a DNA-binding transcriptional
activator that has been shown to
interact with Snf1, the "master regulator" of
glucose repression (35). The eightfold induction
of SIP4 upon depletion of glucose
strongly suggests a role in the induction of 



downstream genes at the diauxic shift. 

Although most of the transcriptional 
responses that we observed were not pre- 
viously known, the responses of many 
genes during the diauxic shift have been 
described. Comparison of the results we 
obtained by DNA microarray hybridiza- 
tion with previously reported results there- 
fore provided a strong test of the sensitiv- 
ity and accuracy of this approach. The 
expression patterns we observed for previ- 
ously characterized genes showed almost 
perfect concordance with previously pub- 
lished results (36). Moreover, the differ- 
ential expression measurements obtained 
by DNA microarray hybridization were re- 
producible in duplicate experiments. For 
example, the remarkable changes in gene 
expression between cells harvested imme- 
diately after inoculation and immediately 
after the diauxic shift (the first and sixth 
intervals in this time series) were mea- 
sured in duplicate, independent DNA mi- 
croarray hybridizations. The correlation 
coefficient for two complete sets of expres- 
sion ratio measurements was 0.87, and for 
more than 95% of the genes, the expression
ratios measured in these duplicate
experiments differed by less than a factor 
of 2. However, in a few cases, there were 
discrepancies between our results and pre- 
vious results, pointing to technical limita- 
tions that will need to be addressed as 
DNA microarray technology advances 
(37, 38). Despite the noted exceptions, 
the high concordance between the results 
we obtained in these experiments and 
those of previous studies provides confi- 
dence in the reliability and thoroughness 
of the survey. 
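
The two reproducibility figures quoted above (a correlation coefficient between duplicate measurements, and the fraction of genes whose duplicate ratios agree within a factor of 2) can be computed as in the sketch below. It assumes the correlation is taken on log-transformed ratios, which the text does not specify; the function names and the five example ratios are hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def fraction_within_factor(ratios_a, ratios_b, factor=2.0):
    """Fraction of genes whose two ratio measurements differ by less than `factor`."""
    limit = math.log(factor)
    close = sum(
        1 for a, b in zip(ratios_a, ratios_b)
        if abs(math.log(a) - math.log(b)) < limit
    )
    return close / len(ratios_a)


# Hypothetical expression ratios for five genes in two duplicate hybridizations.
duplicate_1 = [1.0, 2.0, 4.0, 0.5, 8.0]
duplicate_2 = [1.1, 1.8, 9.0, 0.45, 7.0]

# Assumption: correlate log ratios so that induction and repression are symmetric.
r = pearson([math.log(v) for v in duplicate_1],
            [math.log(v) for v in duplicate_2])
agreement = fraction_within_factor(duplicate_1, duplicate_2)
print(round(r, 2), agreement)
```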

The changes in gene expression during 
this diauxic shift are complex and involve 
integration of many kinds of information 
about the nutritional and metabolic state 
of the cell. The large number of genes 
whose expression is altered and the diver- 
sity of temporal expression profiles ob- 
served in this experiment highlight the 
challenge of understanding the underlying 
regulatory mechanisms. One approach to 
defining the contributions of individual 
regulatory genes to a complex program of 
this kind is to use DNA microarrays to 
identify genes whose expression is affected 



Fig. 2. The section of the ar- 
ray indicated by the gray box 
in Fig. 1 is shown for each of 
the experiments described 
here. Representative genes 
are labeled. In each of the ar- 
rays used to analyze gene 
expression during the diauxic 
shift, red spots represent 
genes that were induced rel- 
ative to the initial timepoint, 
and green spots represent 
genes that were repressed 
relative to the initial timepoint. 
In the arrays used to analyze
the effects of the tup1Δ mutation
and YAP1 overexpression,
red spots represent
genes whose expression was 
increased, and green spots 
represent genes whose ex- 
pression was decreased by 
the genetic modification. Note 
that distinct sets of genes are 
induced and repressed in the 
different experiments. The 
complete images of each of 
these arrays can be viewed on 
the Internet (13). Cell density
as measured by optical density
(OD) at 600 nm was used to
measure the growth of the
culture.
by mutations in each putative regulatory
gene. As a test of this strategy, we analyzed
the genomewide changes in gene expression
that result from deletion of the TUP1 gene.
Transcriptional repression of many genes by
glucose requires the DNA-binding repressor
Mig1 and is mediated by recruiting the transcriptional
co-repressors Tup1 and Cyc8/Ssn6
(39). Tup1 has also been implicated in
repression of oxygen-regulated, mating-type-specific,
and DNA-damage-inducible genes
(40).



Fig. 3. Metabolic reprogramming inferred from global analysis of changes in gene expression. Only key
metabolic intermediates are identified. The yeast genes encoding the enzymes that catalyze each step 
in this metabolic circuit are identified by name in the boxes. The genes encoding succinyl-CoA synthase 
and glycogen-debranching enzyme have not been explicitly identified, but the ORFs YGR244 and 
YPR184 show significant homology to known succinyl-CoA synthase and glycogen-debranching en- 
zymes, respectively, and are therefore included in the corresponding steps in this figure. Red boxes with 
white lettering identify genes whose expression increases in the diauxic shift. Green boxes with dark 
green lettering identify genes whose expression diminishes in the diauxic shift. The magnitude of 
induction or repression is indicated for these genes. For multimeric enzyme complexes, such as 
succinate dehydrogenase, the indicated fold-induction represents an unweighted average of all the 
genes listed in the box. Black and white boxes indicate no significant differential expression (less than 
twofold). The direction of the arrows connecting reversible enzymatic steps indicates the direction of the
flow of metabolic intermediates, inferred from the gene expression pattern, after the diauxic shift. Arrows 
representing steps catalyzed by genes whose expression was strongly induced are highlighted in red. 
The broad gray arrows represent major increases in the flow of metabolites after the diauxic shift, 
inferred from the indicated changes in gene expression. 



Wild-type yeast cells and cells bearing
a deletion of the TUP1 gene (tup1Δ) were
grown in parallel cultures in rich medium
containing glucose as the carbon source.
Messenger RNA was isolated from exponentially
growing cells from the two populations
and used to prepare cDNA labeled
with Cy3 (green) and Cy5 (red),
respectively (11). The labeled probes were
mixed and simultaneously hybridized to
the microarray. Red spots on the microarray
therefore represented genes whose
transcription was induced in the tup1Δ
strain, and thus presumably repressed by
Tup1 (41). A representative section of the
microarray (Fig. 2, bottom middle panel)
illustrates that the genes whose expression
was affected by the tup1Δ mutation were,
in general, distinct from those induced
upon glucose exhaustion [complete images
of all the arrays shown in Fig. 2 are available
on the Internet (13)]. Nevertheless,
34 (10%) of the genes that were induced
by a factor of at least 2 after the diauxic
shift were similarly induced by deletion of
TUP1, suggesting that these genes may be
subject to TUP1-mediated repression by
glucose. For example, SUC2, the gene encoding
invertase, and all five hexose transporter
genes that were induced during the
course of the diauxic shift were similarly
induced, in duplicate experiments, by the
deletion of TUP1.

The set of genes affected by Tup1 in this
experiment also included α-glucosidases,
the mating-type-specific genes MFA1 and
MFA2, and the DNA damage-inducible
RNR2 and RNR4, as well as genes involved
in flocculation and many genes of unknown
function. The hybridization signal corresponding
to expression of TUP1 itself was
also severely reduced because of the (incomplete)
deletion of the transcription unit
in the tup1Δ strain, providing a positive
control in the experiment (42).

Many of the transcriptional targets of
Tup1 fell into sets of genes with related
biochemical functions. For instance, although
only about 3% of all yeast genes
appeared to be TUP1-repressed by a factor
of more than 2 in duplicate experiments
under these conditions, 6 of the 13 genes
that have been implicated in flocculation
(15) showed a reproducible increase in
expression of at least twofold when TUP1
was deleted. Another group of related
genes that appeared to be subject to TUP1
repression encodes the serine-rich cell
wall mannoproteins, such as Tip1 and
Tir1/Srp1, which are induced by cold
shock and other stresses (43), and similar,
serine-poor proteins, the seripauperins
(44). Messenger RNA levels for 23 of the
26 genes in this group were reproducibly
elevated by at least 2.5-fold in the tup1Δ
strain, and 18 of these genes were induced
by more than sevenfold when TUP1 was
deleted. In contrast, none of 83 genes that
could be classified as putative regulators of
the cell division cycle were induced more
than twofold by deletion of TUP1. Thus,
despite the diversity of the regulatory systems
that employ Tup1, most of the genes
that it regulates under these conditions
fall into a limited number of distinct functional
classes.
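
To give a rough sense of how unlikely the flocculation result would be by chance, the sketch below computes a binomial tail probability. This is an illustration only; the text reports no such test. It assumes that each of the 13 flocculation genes would independently appear TUP1-repressed with the genome-wide rate of about 3%, and the function name is hypothetical.

```python
from math import comb


def binomial_tail(n, k, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Assumption: 3% background rate, independent genes, 6 or more hits out of 13.
p_flocculation = binomial_tail(13, 6, 0.03)
print(f"{p_flocculation:.1e}")
```

Under these assumptions the probability is on the order of one in a million, consistent with the observation that Tup1 targets cluster in particular functional classes.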

Because the microarray allows us to
monitor expression of nearly every gene in
yeast, we can, in principle, use this approach
to identify all the transcriptional
targets of a regulatory protein like Tup1. It
is important to note, however, that in any
single experiment of this kind we can only
recognize those target genes that are normally
repressed (or induced) under the
conditions of the experiment. For instance,
the experiment described here analyzed
a MATα strain in which MFA1
and MFA2, the genes encoding the a-factor
mating pheromone precursor, are
normally repressed. In the isogenic tup1Δ
strain, these genes were inappropriately
expressed, reflecting the role that Tup1
plays in their repression. Had we instead
carried out this experiment with a MATa
strain (in which expression of MFA1 and
MFA2 is not repressed), it would not have
been possible to conclude anything regarding
the role of Tup1 in the repression
of these genes. Conversely, we cannot distinguish
indirect effects of the chronic
absence of Tup1 in the mutant strain from
effects directly attributable to its participation
in repressing the transcription of a
gene.

Another simple route to modulating the
activity of a regulatory factor is to overexpress
the gene that encodes it. YAP1 encodes
a DNA-binding transcription factor
belonging to the b-zip class of DNA-binding
proteins. Overexpression of YAP1 in
yeast confers increased resistance to hydrogen
peroxide, o-phenanthroline, heavy
metals, and osmotic stress (45). We analyzed
differential gene expression between a
wild-type strain bearing a control plasmid
and a strain with a plasmid expressing YAP1
under the control of the strong GAL1-10
promoter, both grown in galactose (that is,
a condition that induces YAP1 overexpression).
Complementary DNA from the control
and YAP1-overexpressing strains, labeled
with Cy3 and Cy5, respectively, was
prepared from mRNA isolated from the two
strains and hybridized to the microarray.
Thus, red spots on the array represent genes
that were induced in the strain overexpressing
YAP1.

Of the 17 genes whose mRNA levels
increased by more than threefold when
YAP1 was overexpressed in this way, five
bear homology to aryl-alcohol oxidoreductases
(Fig. 2 and Table 1). An additional
four of the genes in this set also belong to
the general class of dehydrogenases/oxidoreductases.
Very little is known about
the role of aryl-alcohol oxidoreductases in
S. cerevisiae, but these enzymes have been
isolated from ligninolytic fungi, in which
they participate in coupled redox reactions,
oxidizing aromatic and aliphatic
unsaturated alcohols to aldehydes with the
production of hydrogen peroxide (46, 47).
The fact that a remarkable fraction of the
targets identified in this experiment belong
to the same small, functional group of
oxidoreductases suggests that these genes
might play an important protective role
during oxidative stress. Transcription of a
small number of genes was reduced in the
strain overexpressing Yap1. Interestingly,
many of these genes encode sugar permeases
or enzymes involved in inositol
metabolism.

We searched for Yap1-binding sites
(TTACTAA or TGACTAA) in the sequences
upstream of the target genes we
identified (48). About two-thirds of the
genes that were induced by more than
threefold upon Yap1 overexpression had
one or more binding sites within 600 bases
upstream of the start codon (Table 1), suggesting
that they are directly regulated by
Yap1. The absence of canonical Yap1-bind-
Fig. 4. Coordinated regulation
of functionally related
genes. The curves
represent the average induction
or repression ratios
for all the genes in
each indicated group.
The total number of
genes in each group was
as follows: ribosomal
proteins, 112; translation
elongation and initiation
factors, 25; tRNA synthetases (excluding mitochondrial synthetases), 17; glycogen and trehalose synthesis
and degradation, 15; cytochrome c oxidase and reductase proteins, 19; and TCA- and glyoxylate-cycle
enzymes, 24.

Table 1. Genes induced by YAP1 overexpression. This list includes all the genes for which mRNA levels
increased by more than twofold upon YAP1 overexpression in both of two duplicate experiments, and
for which the average increase in mRNA level in the two experiments was greater than threefold (50).
Positions of the canonical Yap1 binding sites upstream of the start codon, when present, and the
average fold-increase in mRNA levels measured in the two experiments are indicated.

ORF       Distance of Yap1    Gene   Description                                              Fold-
          site from ATG                                                                       increase

YNL331C                              Putative aryl-alcohol reductase                          12.9
YKL071W   162-222 (5 sites)          Similarity to bacterial csgA protein                     10.4
YML007W                       YAP1   Transcriptional activator involved in                     9.8
                                     oxidative stress response
YFL056C   223, 242                   Homology to aryl-alcohol dehydrogenases                   9.0
YLL060C   98                         Putative glutathione transferase                          7.4
YOL165C   266                        Putative aryl-alcohol dehydrogenase (NADP+)               7.0
YCR107W                              Putative aryl-alcohol reductase                           6.5
YML116W   409                 ATR1   Aminotriazole and 4-nitroquinoline                        6.5
                                     resistance protein
YBR008C   142, 167, 364              Homology to benomyl/methotrexate                          6.1
                                     resistance protein
YCLX08C                              Hypothetical protein                                      6.1
YJR155W                              Putative aryl-alcohol dehydrogenase                       6.0
YPL171C   148, 212            OYE3   NADPH dehydrogenase (old yellow enzyme),                  5.8
                                     isoform 3
YLR460C   167, 317                   Homology to hypothetical proteins                         4.7
                                     YCR102c and YNL134c
YKR076W   178                        Homology to hypothetical protein YMR251w                  4.5
YHR179W   327                 OYE2   NAD(P)H oxidoreductase (old yellow enzyme),               4.1
                                     isoform 1
YML131W   507                        Similarity to A. thaliana zeta-crystallin                 3.7
                                     homolog
YOL126C                       MDH2   Malate dehydrogenase                                      3.3

ing sites upstream of the others may reflect
an ability of Yap1 to bind sites that differ
from the canonical binding sites, perhaps in
cooperation with other factors, or, less likely,
may represent an indirect effect of Yap1
overexpression, mediated by one or more
intermediary factors. Yap1 sites were found
only four times in the corresponding region
of an arbitrary set of 30 genes that were not
differentially regulated by Yap1.
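
The binding-site search described above can be sketched as follows. This is a minimal illustration, not the authors' procedure: it assumes the upstream sequence is written 5' to 3' and ends immediately before the start codon, that distance is counted from the first base of the motif to the ATG, and that both strands are searched by default. The text states none of these details, and all names are hypothetical.

```python
# Canonical Yap1-binding sites quoted in the text.
YAP1_SITES = ("TTACTAA", "TGACTAA")


def reverse_complement(seq):
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def yap1_site_distances(upstream, window=600, both_strands=True):
    """Distances upstream of the start codon at which a canonical site begins.

    `upstream` is the sequence 5' of the ATG, written 5'->3', so its last
    base sits one position upstream of the A of the start codon.
    """
    region = upstream.upper()[-window:]
    motifs = set(YAP1_SITES)
    if both_strands:  # assumption: either orientation counts as a site
        motifs |= {reverse_complement(m) for m in YAP1_SITES}
    distances = set()
    for motif in motifs:
        start = region.find(motif)
        while start != -1:
            distances.add(len(region) - start)
            start = region.find(motif, start + 1)
    return sorted(distances)


# Hypothetical upstream sequence: one forward site starting 12 bases from the ATG.
example = "C" * 10 + "TTACTAA" + "C" * 5
print(yap1_site_distances(example))  # [12]
```

Applying a function like this to the 600 bases upstream of each induced gene, and to a control set of unregulated genes, would reproduce the kind of comparison reported in the text.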

Use of a DNA microarray to characterize
the transcriptional consequences of
mutations affecting the activity of regulatory
molecules provides a simple and powerful
approach to dissection and characterization
of regulatory pathways and networks.
This strategy also has an important
practical application in drug screening.
Mutations in specific genes encoding candidate
drug targets can serve as surrogates
for the ideal chemical inhibitor or modulator
of their activity. DNA microarrays
can be used to define the resulting signature
pattern of alterations in gene expression,
and then subsequently used in an
assay to screen for compounds that reproduce
the desired signature pattern.

DNA microarrays provide a simple and 
economical way to explore gene expres- 
sion patterns on a genomic scale. The 
hurdles to extending this approach to any 
other organism are minor. The equipment 
Fig. 5. Distinct temporal patterns of induction or repression help to group genes that share regulatory
properties. (A) Temporal profile of the cell density, as measured by OD at 600 nm and glucose
concentration in the media. (B) Seven genes exhibited a strong induction (greater than ninefold) only at
the last timepoint (20.5 hours). With the exception of IDP2, each of these genes has a CSRE UAS. There
were no additional genes observed to match this profile. (C) Seven members of a class of genes marked
by early induction with a peak in mRNA levels at 18.5 hours. Each of these genes contains STRE motif
repeats in its upstream promoter region. (D) Cytochrome c oxidase and ubiquinol cytochrome c
reductase genes. Marked by an induction coincident with the diauxic shift, each of these genes contains
a consensus binding motif for the HAP2,3,4 protein complex. At least 17 genes shared a similar
expression profile. (E) SAM1, GPP1, and several genes of unknown function are repressed before the
diauxic shift, and continue to be repressed upon entry into stationary phase. (F) Ribosomal protein
genes comprise a large class of genes that are repressed upon depletion of glucose. Each of the genes
profiled here contains one or more RAP1-binding motifs upstream of its promoter. RAP1 is a transcriptional
regulator of most ribosomal proteins.



required for fabricating and using DNA 
microarrays (9) consists of components 
that were chosen for their modest cost and 
simplicity. It was feasible for a small group 
to accomplish the amplification of more 
than 6000 genes in about 4 months and, 
once the amplified gene sequences were in 
hand, only 2 days were required to print a 
set of 110 microarrays of 6400 elements 
each. Probe preparation, hybridization, 
and fluorescent imaging are also simple 
procedures. Even conceptually simple ex- 
periments, as we described here, can yield 
vast amounts of information. The value of 
the information from each experiment of 
this kind will progressively increase as 
more is learned about the functions of 
each gene and as additional experiments 
define the global changes in gene expres- 
sion in diverse other natural processes and 
genetic perturbations. Perhaps the greatest 
challenge now is to develop efficient 
methods for organizing, distributing, inter- 
preting, and extracting insights from the 
large volumes of data these experiments 
will provide. 
We describe here a method for drug target validation and identification of secondary drug target
effects based on genome-wide gene expression patterns. The method is demonstrated by
several experiments, including treatment of yeast mutant strains defective in calcineurin, im- 
munophilins or other genes with the immunosuppressants cyclosporin A or FK506. Presence or 
absence of the characteristic drug 'signature' pattern of altered gene expression in drug-treated 
cells with a mutation in the gene encoding a putative target established whether that target was 
required to generate the drug signature. Drug-dependent effects were seen in 'targetless' cells,
showing that FK506 affects additional pathways independent of calcineurin and the im- 
munophilins. The described method permits the direct confirmation of drug targets and recog- 
nition of drug-dependent changes in gene expression that are modulated through pathways 
distinct from the drug's intended target. Such a method may prove useful in improving the effi- 
ciency of drug development programs. 



Good drugs are potent and specific; that is, they must have 
strong effects on a specific biological pathway and minimal ef- 
fects on all other pathways. Confirmation that a compound in- 
hibits the intended target (drug target validation) and the 
identification of undesirable secondary effects are among the 
main challenges in developing new drugs. Comprehensive 
methods that enable researchers to determine which genes or 
activities are affected by a given drug might improve the effi- 
ciency of the drug discovery process by quickly identifying po- 
tential protein targets, or by accelerating the identification of 
compounds likely to be toxic. DNA microarray technology, 
which permits simultaneous measurement of the expression 
levels of thousands of genes, provides a comprehensive frame- 
work to determine how a compound affects cellular metabolism 
and regulation on a genomic scale 1 "". DNA microarrays that 
contain essentially every open reading frame (ORF) in the 
Saccharomyces cerevisiae genome have already been used success- 
fully to explore the changes in gene expression that accompany 
large changes in cellular metabolism or cell cycle progression 7 * 10 . 

In the modern drug discovery paradigm, which typically begins
with the selection of a single molecular target, the ideal inhibitory
drug is one that inhibits a single gene product so
completely and so specifically that it is as if the gene product
were absent. Treating cells with such a drug should induce
changes in gene expression very similar to those resulting from
deleting the gene encoding the drug's target. Here we have compared
the genome-wide effects on gene expression that result
from deletions of various genes in the budding yeast S. cerevisiae
to the effects on gene expression that result from treatment
with known inhibitors of those gene products. Using the calcineurin
signaling pathway as a model system, we tested an approach
that permits identification of genes that encode proteins
specifically involved in pathways affected by a drug. The FK506
characteristic pattern, or 'signature', of altered gene expression
was not observed in mutant cells lacking proteins inhibited by
FK506 (for example, a calcineurin or FK506-binding-protein
mutant strain), but was observed in mutants deleted for genes
in pathways unrelated to FK506 action (for example, a cyclophilin
mutant strain). Conversely, the cyclosporin A (CsA)
signature was not observed in CsA-treated calcineurin or cyclophilin
mutant strains, but was seen in an FK506-binding-protein
mutant strain treated with CsA. The method also
demonstrates that FK506, a clinically used immunosuppressant,
has 'off-target' effects that are independent of its binding to immunophilins.
Thus, the approach we describe may provide a
way to identify the pathways altered by a drug and to detect
drug effects mediated through unintended targets.

Null mutants phenocopy drug-treated cells on a genomic scale 
To test whether a null mutation in a drug target serves as a 
model of an ideal inhibitory drug, we examined the effects on 
gene expression associated with pharmacological or genetic in- 
hibition of calcineurin function. Calcineurin is a highly con- 
served calcium- and calmodulin-activated serine/threonine 
protein phosphatase implicated in diverse processes dependent 
on calcium signaling 12 * 13 . In budding yeast, calcineurin is re- 
quired for intracellular ion homeostasis 14 , for adaptation to pro- 
longed mating pheromone treatment 15 and in the regulation of 
Fig. 1 Model of antagonism of the calcineurin signaling pathway mediated
by FK506 and cyclosporin A (CsA). Calcineurin activity is composed of a catalytic
subunit (calcineurin A, encoded in yeast by the CNA1 and CNA2 genes),
and calcium-binding regulatory subunits calmodulin (CMD) and calcineurin B
(CnB). After entering cells, FK506 and CsA specifically bind and inhibit the
peptidyl-proline isomerase activity of their respective immunophilins, FK506
binding proteins (FKBP) and cyclophilins (CyP). The most abundant immunophilins
in yeast (Fpr1 and Cph1) are thought to mediate calcineurin inhibition.
Drug-immunophilin complexes bind and inhibit the calcium- and
calmodulin-stimulated phosphatase calcineurin. Among the substrates of calcineurin
are transcriptional activators that act to modulate gene expression.



the onset of mitosis 16 . In mammals, calcineurin has been implicated
in T-cell activation 12 , in apoptosis 17 , in cardiac hypertrophy 18
and in the transition from short-term to long-term
memory 19 . In both organisms, calcineurin activity is inhibited
by FK506 and CsA, immunosuppressant drugs whose effects on
calcineurin are mediated through families of intracellular receptor
proteins called immunophilins 12 20 (Fig. 1). To assess the effects
of pharmacologic inhibition of calcineurin, wild-type S.
cerevisiae was grown to early logarithmic phase in the presence
or absence of FK506 or CsA. Isogenic cells, from which the
genes encoding the catalytic subunits of calcineurin (CNA1 and
CNA2) had been deleted 21 (referred to as the cna or calcineurin
mutant), were grown in parallel, in the absence of the drug.
Fluorescently-labeled cDNA was prepared by reverse transcription
of polyA+ RNA in the presence of Cy3- or Cy5-deoxynucleotide
triphosphates and then hybridized to a microarray
containing more than 6,000 DNA probes representing 97% of
the known or predicted ORFs in the yeast genome.
Simultaneous hybridization of Cy5-labeled cDNA from mock-treated
cells and Cy3-labeled cDNA from cells treated with 1
µg/ml FK506 allowed the effect of drug treatment on mRNA levels
of each ORF to be determined (Fig. 2a and b and data not
shown). Similarly, effects of the calcineurin mutations on the
mRNA levels of each gene were assessed by simultaneous hybridization
of Cy5-labeled cDNA from wild-type cells and Cy3-labeled
cDNA from the calcineurin mutant strain (Fig. 2c). For
each comparison of this kind, reported expression ratios are the
average of at least two hybridizations in which the Cy3 and Cy5
fluors were reversed to remove biases that may be introduced by
gene-specific differences in incorporation of the two fluors
(data not shown).
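
The reason fluor reversal removes a gene-specific labeling bias can be shown with a short calculation. The sketch below assumes the pair of hybridizations is averaged in log space, which the text does not specify, and that the bias multiplies the red/green ratio by the same factor in both hybridizations. The function name and numbers are hypothetical.

```python
import math


def dye_swap_ratio(red_green_forward, red_green_reversed):
    """Combine a fluor-reversed pair of red/green ratios into one expression ratio.

    In the forward hybridization red/green measures sample A over sample B;
    in the reversed hybridization the labels are swapped, so red/green
    measures B over A. Averaging log(forward) and -log(reversed) cancels
    a multiplicative dye bias shared by both hybridizations.
    """
    log_ratio = (math.log10(red_green_forward) - math.log10(red_green_reversed)) / 2
    return 10 ** log_ratio


# Hypothetical gene: true A/B ratio of 4, with a 1.5-fold bias toward the red fluor.
true_ratio, bias = 4.0, 1.5
forward = bias * true_ratio    # observed red/green with A labeled red
reversed_ = bias / true_ratio  # observed red/green with B labeled red
combined = dye_swap_ratio(forward, reversed_)
print(round(combined, 6))
```

With these numbers the combined ratio recovers the true fourfold difference even though each single hybridization overstates or understates it.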

Treatment with FK506 in these growth conditions resulted in
a signature pattern of altered gene expression in which mRNA
levels of 36 ORFs changed by more than twofold
(http://www.rosetta.org). A very similar pattern of altered gene
expression was observed when the calcineurin mutant strain
was compared to wild-type cells. Comparison of the changes in
mRNA expression of each gene resulting from treatment of
wild-type cells with FK506 with mRNA expression changes resulting
from deletion of the calcineurin genes showed the considerable
similarity of the global transcript alterations in
response to the two perturbations (Fig. 2b-d). Quantification of
this similarity using the correlation coefficient (ρ) showed
large correlations between the FK506 treatment signature and
the calcineurin deletion signature (ρ = 0.75 ± 0.03), as well as
the CsA treatment signature (ρ = 0.94 ± 0.02), but not with a
randomly selected deletion mutant strain (deleted for the
YER071C gene; ρ = -0.07 ± 0.04; Fig. 2e). The FK506 treatment
signature was also compared with those of more than 40 other
deletion mutant strains or drug-treatments thought to affect
unrelated pathways, and none had statistically significant correlations.
These data establish that genetic disruption of calcineurin
function provides a close and specific phenocopy of
treatment with FK506 or CsA.

To avoid generalizing from a single example, we also compared
the effects of treatment of wild-type cells with 3-aminotriazole
(3-AT) with the effects of deletion of the HIS3 gene. HIS3
encodes imidazoleglycerol phosphate dehydratase, which catalyzes
the seventh step of the histidine biosynthetic pathway in
yeast 22 ; 3-AT is a competitive inhibitor of this enzyme that triggers
a large transcriptional amino-acid starvation response 23 .
Microarray analysis of wild-type and isogenic his3-deficient
strains demonstrated the expected large genome-wide transcriptional
responses (involving more than 1,000 ORFs) resulting
from treatment with 3-AT (Fig. 3a) or from HIS3 deletion (Fig.
3c). Quantitative comparison of the 3-AT treatment signature
and the his3 mutant signature showed a high level of correlation
(ρ = 0.76 ± 0.02) that even extended to genes that experienced
small changes in expression level (Fig. 3d). As a negative control,
the correlations between the 3-AT treatment signature or the
his3 mutant signature and the calcineurin mutant strain were
not statistically significant (ρ = 0.09 ± 0.06 and -0.01 ± 0.04, respectively).
That both the calcineurin/FK506 and the his3/3-AT
comparisons were highly correlated indicates that in many cases
the expression profile resulting from a gene deletion closely resembles
the expression profile of wild-type cells treated with an
inhibitor of that gene's product.

'Decoder' strategy: Drug target validation with deletion mutants 

Because pharmacological inhibition of different targets might 
give similar or identical expression profiles, simple comparison 
of drug signatures to mutant signatures is unlikely to unambiguously
identify a drug's target. To overcome this limitation, an
additional 'decoder' step is used. We first compare the expression
profile of wild-type drug-treated cells to the expression profiles
from a panel of genetic mutant strains, using a correlation
coefficient metric. Mutant strains whose expression profile is 
similar to that of drug-treated wild-type cells are selected and 
subjected to drug treatment, generating the drug signature in 
the mutant strain (that is, the mutant drug signature). If the 
mutated gene encodes a protein involved in a pathway affected 
by the drug, we expect the drug signature in mutant cells to be 
different (or absent, for an ideal drug) from the drug signature 
seen in wild-type cells. 
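
The two-step logic of the decoder strategy can be written out as a short sketch. It is an illustration under stated assumptions, not the authors' software: signatures are lists of log expression ratios over the same ORFs, similarity is the Pearson correlation coefficient, and the two thresholds (`similar`, `absent`), the strain names and all numbers are hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def decoder(drug_wt, mutant_sigs, drug_in_mutant_sigs, similar=0.5, absent=0.2):
    """Return strains whose mutated gene appears to lie in a drug-affected pathway."""
    candidates = []
    for strain, mutant_sig in mutant_sigs.items():
        # Step 1: keep mutants whose own signature resembles the drug signature.
        if pearson(drug_wt, mutant_sig) < similar:
            continue
        # Step 2: the drug signature should be lost when that mutant is treated.
        if pearson(drug_wt, drug_in_mutant_sigs[strain]) < absent:
            candidates.append(strain)
    return candidates


# Hypothetical log ratios for five ORFs.
drug_wt = [1.0, -1.0, 0.5, 0.0, -0.5]
mutant_sigs = {
    "mutA": [0.9, -1.1, 0.4, 0.1, -0.6],     # resembles the drug signature
    "mutB": [1.1, -0.9, 0.6, 0.0, -0.4],     # also resembles it
    "mutC": [0.0, 0.1, -0.1, 0.05, 0.0],     # unrelated
}
drug_in_mutant_sigs = {
    "mutA": [0.05, 0.1, -0.1, 0.0, 0.02],    # drug signature absent
    "mutB": [0.9, -1.0, 0.5, 0.1, -0.5],     # drug signature retained
    "mutC": [1.0, -1.0, 0.5, 0.0, -0.5],
}
print(decoder(drug_wt, mutant_sigs, drug_in_mutant_sigs))  # ['mutA']
```

Only the strain that both resembles the drug-treated profile and loses the drug signature on treatment is reported, which mirrors the reasoning applied to the his3 and calcineurin mutants in the text.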
Fig. 2 Expression profiles from FK506-treated wild-type (wt)
cells and a calcineurin-disruption mutant strain share a genome-wide
correlation. DNA microarray analysis showing changes in gene
expression resulting from FK506 treatment (a and b) or from genetic
disruption of genes encoding calcineurin (c). a, Pseudo-color
image of the results of simultaneous hybridization of Cy5-labeled
cDNA (red) from mock-treated strain R563 and Cy3-labeled cDNA
(green) from strain R563 treated with 1 µg/ml FK506.
b, Enlarged view of the boxed area in a. Arrowheads indicate
specific ORFs induced or repressed. c, Pseudo-color
image of the results of simultaneous hybridization
of Cy5-labeled cDNA (red) from strain R563 and Cy3-labeled
cDNA (green) from strain MCY300 (deleted for
the CNA1, CNA2 catalytic subunits of calcineurin).
Arrows indicate specific ORFs induced or repressed. d,
The log10 of the expression ratio for each ORF derived
from the FK506 treatment hybridizations is plotted versus
the log10 of the expression ratio in the calcineurin
mutant hybridizations. ORFs that were induced or repressed
in both experiments are shown as green and
red dots, respectively. e, The log10 of the expression ratio for each ORF derived
from the FK506 treatment hybridizations is plotted versus the log10
of the expression ratio in the yer071c mutant hybridizations. No ORFs
were induced or repressed in both experiments.

[Figure panels: wt -/+ 1 µg/ml FK506; wt vs. calcineurin mutant. Plot axes: Log10 (R/G) calcineurin mutation; Log10 (R/G) yer071c mutation.]



To illustrate this, we treated the his3 mutant strain with 3- 
AT. The signature pattern of altered gene expression resulting 
from treatment of the mutant strain with 3-AT was much less 
complex than that of the 3-AT signature in wild-type cells (Fig. 
4). This is seen simply by examining plots of mean intensity of 
the hybridization signal (which approximately reflects level of 
expression) versus the expression ratio for each ORF (Fig. 4). 
Genes that were expressed at higher or lower levels in 3-AT 
treated cells or in his3 mutant cells are shown as red and green 
dots, respectively. We analyzed the 3-AT signature in wild-type 
(Fig. 4a) and his3 mutant cells (Fig. 4c), as well as the his3 mu- 
tant strain signature (Fig. 4b). Whereas histidine limitation in- 
duced by 3-AT induced more than 1,000 transcription-level 
changes in the wild-type strain, few or no transcript level 
changes were induced by treatment of the his3-deletion strain 
with 3-AT. This indicates that with the growth conditions used, 
essentially all of the effects of 3-AT depend on or are mediated 
through the HIS3 gene product. 

Applying this approach to the calcineurin signaling pathway 
showed the specificity of the method. The calcineurin mutant 
strain and strains with deletions in the genes encoding the 
most abundant immunophilins in yeast (ref. 12) (CPH1 and FPR1) 
were treated with either FK506 or CsA to determine the profiles 



Table 1 Signature correlation of expression ratios as a result of FK506
treatment in various mutant strains

                     wild-type     cna           fpr1          cna fpr1      cph1
                     +/- FK506     +/- FK506     +/- FK506     +/- FK506     +/- FK506
wild-type +/- FK506  0.93 ± 0.04   -0.01 ± 0.07  -0.23 ± 0.07  0.12 ± 0.07   0.79 ± 0.03

Signature correlation shows the absence of the FK506 signature specifically
in the calcineurin (cna) and fpr1 (major FK506 binding protein) deletion
mutants. cna represents the mutant with deletions of the catalytic subunits
of calcineurin, CNA1 and CNA2. The correlation coefficient reported in the
first column represents the correlation between two pairs of hybridizations
from independent wild-type +/- FK506 experiments.



of altered gene expression resulting from drug treatment of the
mutant cells (that is, mutant +/- drug). We compared the drug
signatures in the mutants to the wild-type drug signature using
the correlation coefficient metric (Table 1). Although the
signature generated by treatment of wild-type cells with FK506 was
highly correlated to the calcineurin mutant strain signature
(ρ = 0.75 ± 0.03), it bore no similarity to the profile after
treatment of the calcineurin mutant strain with FK506
(ρ = -0.01 ± 0.07). This indicates that FK506 was unable to elicit
its normal transcriptional response in the calcineurin mutant strain.
Likewise, treatment of the fpr1 mutant strain with FK506
elicited an expression profile that was not correlated to the
FK506 signature in the wild-type strain (ρ = -0.23 ± 0.07),
indicating that the FPR1 gene product is likely to be involved in the
pathway affected by FK506. The same was true for the cna fpr1
mutant strain. In contrast, treatment of the cph1 mutant strain
with FK506 generated an expression profile highly correlated
with the wild-type FK506 expression profile (ρ = 0.79 ± 0.03),
indicating that the cph1 mutation did not block the mode of action
of FK506 and thus that CPH1 is not directly involved in the pathway
affected by FK506. We tabulated the change in expression in
response to FK506 in different mutant strains for all ORFs with
expression ratios greater than 1.8 in FK506-treated cells or in
the calcineurin mutant strain (Fig. 5a). The
calcineurin mutant strain signature and the
FK506 responses in wild-type and the cph1
mutant strain are similar, and there are no
transcript-level changes (seen in black) for
treatment of the calcineurin, fpr1 and cna
fpr1 mutant strains with FK506 (Fig. 5a).

Similar experiments and analyses with CsA 
provided further validation of this approach. 
The expression profile elicited by treatment 
of wild-type cells with CsA was highly corre- 



Fig. 3 Expression profiles from a his3 mutant strain and wild-type (wt)
cells treated with 3-AT share a genome-wide correlation. DNA microarray
analysis showing changes in gene expression resulting from 3-AT treatment
(a) or from genetic disruption of the HIS3 gene (c). a, Pseudo-color image
of the results of simultaneous hybridization of Cy5-labeled cDNA (red) from
mock-treated wild-type strain R491 and Cy3-labeled cDNA (green) from strain
R491 treated with 10 mM 3-AT. b, The log10 of the expression ratio for each
ORF derived from the 3-AT treatment hybridizations is plotted versus the
log10 of the expression ratio in the his3 mutant hybridizations. ORFs that
were induced or repressed in both experiments are shown as green and red
dots, respectively. The correlation of expression ratios applies not only
to genes with large expression ratios (for example, CHA1 and ARG1), but
also extends to genes with expression ratios less than 2 (for example, ILV1
and CPH1). ILV1 is induced 1.9-fold and 1.5-fold, and CPH1 is downregulated
1.9-fold and 1.7-fold, in cells treated with 3-AT and his3 mutant cells,
respectively. Two ORFs do not fall on the line x = y. The leftmost point is
the HIS3 data point, which is induced by 3-AT treatment but which is not
absent from the his3 mutant strain. The other point is YOR203W. Both data
points are labeled HIS3 because hybridization to YOR203W is most likely due
to HIS3 mRNA, as YOR203W overlaps the HIS3 open reading frame. c,
Pseudo-color image of the results of simultaneous hybridization of
Cy5-labeled cDNA (red) from wild-type strain R491 and Cy3-labeled cDNA
(green) from strain R1226, deleted for the HIS3 gene. Arrowheads indicate
specific ORFs induced or repressed.

[Panel titles: wt -/+ 10 mM 3-AT; wt vs. his3 mutation. Point labels: ARG1,
CHA1. Axis label: Log10 (R/G) his3 mutation.]



lated to the profile elicited by mutation of the calcineurin genes 
(ρ = 0.71 ± 0.04), but did not correlate with the expression
profile resulting from treatment of the calcineurin mutant strain
with CsA (ρ = -0.05 ± 0.07; Table 2), indicating that the genetic
deletion of calcineurin interfered with the ability of CsA to
elicit its normal transcriptional response. Likewise, the CsA
signature was essentially absent in CsA-treated cph1 mutant cells,
and the expression profile of CsA-treated cph1 mutant cells
correlated poorly to that of CsA-treated wild-type cells
(ρ = 0.18 ± 0.07). Thus, the CPH1 gene product was required for the
CsA response seen in wild-type cells. Conversely, treatment of fpr1
mutant cells with CsA resulted in an expression pattern very
similar to the profile of CsA-treated wild-type cells
(ρ = 0.77 ± 0.03), indicating that FPR1 was not necessary for the
CsA-mediated effects. Analysis of individual ORFs affected by CsA
and their expression ratios over the entire set of experiments
confirmed that CPH1 and the genes encoding calcineurin, but not
FPR1, are necessary for the wild-type CsA response (Fig. 5b). The
observation that the profiles resulting from FK506 or CsA drug
treatment are similar to that of the calcineurin deletion mutant
strain might allow the prediction that calcineurin was involved
in the pathway affected by these drugs. But because the
expression profile of the fpr1 mutant strain did not bear a strong
similarity to the wild-type drug expression profile for FK506, it
is clear that drug treatment of the mutant strains was necessary
to identify Fpr1, but not Cph1, as a potential FK506 drug
target. In the same way, the 'decoder' strategy was necessary to
identify Cph1, but not Fpr1, as a potential drug target for CsA.
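
The comparison reported in Tables 1 and 2 can be sketched in a few lines. This is only an illustration: the correlation values are those quoted in the text, but the 0.5 cutoff and the names RETAINED_CUTOFF and decode are assumptions made for the sketch, not part of the published method.

```python
# Correlation of each mutant +/- drug profile with the wild-type drug
# signature, as quoted in the text (Tables 1 and 2).
TABLE_1_FK506 = {"cna": -0.01, "fpr1": -0.23, "cna fpr1": 0.12, "cph1": 0.79}
TABLE_2_CSA = {"cna": -0.05, "fpr1": 0.77, "cna cph1": -0.11, "cph1": 0.18}

# Hypothetical cutoff, chosen only for this sketch; the paper states no
# single threshold for calling a signature retained or lost.
RETAINED_CUTOFF = 0.5


def decode(correlations, cutoff=RETAINED_CUTOFF):
    """Label each deletion strain by whether the drug signature survives.

    A lost signature suggests the deleted gene lies in a pathway the drug
    acts through; a retained signature suggests it does not.
    """
    calls = {}
    for strain, rho in correlations.items():
        calls[strain] = "signature retained" if rho >= cutoff else "signature lost"
    return calls


if __name__ == "__main__":
    for drug, table in (("FK506", TABLE_1_FK506), ("CsA", TABLE_2_CSA)):
        for strain, call in decode(table).items():
            print(f"{drug:6s} {strain:9s} {call}")
```

With these inputs the fpr1 deletion loses the FK506 signature but keeps the CsA signature, and the cph1 deletion does the reverse, which is the reciprocal pattern described above.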

'Decoder' approach can identify secondary drug effects 

For a drug that has a single biochemical target, the strategy out- 
lined above may be useful in target validation. In many cases, 
however, a compound may affect multiple pathways and elicit 
a very complex signature. 'Decoding' such a complex signature 



Fig. 4 Treatment of the his3 mutant strain with 3-AT shows nearly complete
loss of 3-AT signature. A plot of the log10 of the mean intensity of
hybridization for each ORF versus the log10 of its expression ratio for
each experiment is shown next to a pseudo-color image of a representative
portion of the microarray. ORFs that are induced or repressed at the 95%
confidence level are shown in green and red, respectively. a, Expression
profile from treatment of the wild-type (wt) strain with 3-AT. Cy5-labeled
cDNA (red) from mock-treated strain R491 and Cy3-labeled cDNA (green) from
strain R491 treated with 10 mM 3-AT. b, Expression profile from the his3
deletion strain. Cy5-labeled cDNA (red) from strain R491 and Cy3-labeled
cDNA (green) from strain R1226, deleted for the HIS3 gene. c, Expression
profile of treatment of the his3 deletion strain with 3-AT. Cy3-labeled
cDNA (red) from his3-deleted strain R1226 and Cy5-labeled cDNA (green) from
strain R1226 treated with 10 mM 3-AT. Arrowheads indicate the DNA probe and
data point corresponding to the HIS3 gene. The blue dashed line represents
the threshold below which errors tend to increase rapidly because spot
intensities are not sufficiently above background intensity.

[Panel titles: wt -/+ 10 mM 3-AT; wt vs. his3 mutant; his3 mutant -/+ 10 mM
3-AT.]
Table 2 Signature correlation of expression ratios as a result of CsA
treatment in various mutant strains

                   wild-type     cna           fpr1          cna cph1      cph1
                   +/- CsA       +/- CsA       +/- CsA       +/- CsA       +/- CsA
wild-type +/- CsA  0.94 ± 0.04   -0.05 ± 0.07  0.77 ± 0.03   -0.11 ± 0.07  0.18 ± 0.07

Signature correlation shows the absence of the CsA signature specifically
in the calcineurin (cna) and cph1 (cyclophilin) deletion mutants. cna
represents the mutant with deletions of the catalytic subunits of
calcineurin, CNA1 and CNA2. The correlation coefficient reported in the
first column represents the correlation between two pairs of hybridizations
from independent wild-type +/- CsA experiments.



into the effects mediated through the intended target (the
'on-target' signature) and those mediated through unintended
targets (the 'off-target' signature) might be useful in evaluating a
compound's specificity. Our 'decoder' strategy is based on the
premise that the 'off-target' signature should be insensitive to the
genetic disruption of the primary target.

To determine whether the 'decoder' approach could identify
an 'off-target' profile, we looked for a drug-responsive gene
whose expression is insensitive to deletion of the primary
target. To increase the likelihood of observing such genes, the
same strains described in Tables 1 and 2 were treated with
higher concentrations (50 µg/ml) of FK506. This led to a much
more complex expression profile in wild-type cells, indicating
that at this higher concentration, FK506 was inhibiting or
activating additional targets. Several of the ORFs in this expanded
FK506-induced expression profile were not affected by the
calcineurin, cph1 or fpr1 mutations, as drug treatment of these
mutant strains did not block their presence in the FK506
expression signature (Fig. 6). This indicates that FK506 was
triggering changes in transcript levels of many genes through
pathways independent of calcineurin, CPH1 and FPR1. Many of the
upregulated ORFs in the 'off-target' pathway were genes
reported to be regulated by the transcriptional activator Gcn4
(ref. 24). In some strains, a reporter gene under GCN4 control
was induced in response to FK506 treatment (ref. 25). To determine
whether GCN4 is involved in this pathway that is independent
of calcineurin, CPH1 and FPR1, we analyzed the effects of
treatment with high-dose FK506 on global gene expression in a
strain with a GCN4 deletion (Fig. 6). Of the 41 ORFs with
calcineurin-independent expression ratios greater than 4, 32 were
not induced in the gcn4 mutant, indicating that their induction
by FK506 was GCN4-dependent. Not all GCN4-regulated genes
were induced by FK506. This FK506-induced subset of
GCN4-regulated genes may be those most sensitive to subtle changes
in Gcn4 levels, or perhaps other regulatory circuits prevent
FK506 activation of some GCN4-regulated genes. Seven of the
remaining nine ORFs induced by FK506 were independent of

Fig. 5 Response of FK506 and CsA signature genes in strains with deletions
in different genes. Genes with expression ratios greater than a factor of 1.8 in
response to treatment with 1 µg/ml FK506 (a) or 50 µg/ml CsA (b) are listed
(left side) and their expression ratios in the indicated strain are shown on the
green (induction)-red (repression) color scale. a, Calcineurin (cna) mutant
and FK506 treatment signature genes are in the first two columns. Almost all
FK506 signature genes have expression ratios near unity in deletion strains
involved in pathways affected by FK506 (calcineurin, fpr1 and cna fpr1
mutants) but not in deletion strains in unrelated pathways (cph1). b, Calcineurin
(cna) mutant and CsA treatment signature genes are in the first two
columns. Almost all CsA signature genes have expression ratios near unity in
deletion strains involved in pathways affected by CsA (calcineurin, cph1 and
cna cph1 mutants) but not in deletion strains in unrelated pathways (fpr1).



both the calcineurin and GCN4 pathways. The
simplest explanation is that FK506 inhibits or
activates additional pathways. Members of this
class include SNQ2 and PDR5, genes that encode
drug efflux pumps with structural homology
to mammalian multiple drug resistance
proteins (ref. 26). FK506 may interact directly with
Pdr5 to inhibit its function (ref. 27). Our results
indicate that treatment with FK506 leads to
fourfold-to-sixfold induction of PDR5 mRNA levels.
YOR1, another gene that can confer drug
resistance, is also induced threefold-to-fourfold by
FK506. Thus, drug treatment of strains with mutations in the
primary targets can prove useful in identifying effects mediated
by secondary drug targets, including the nature and extent of
newly discovered and previously unsuspected pathways
affected by the drug.

We describe here a method for drug target validation and the 
identification of secondary drug target effects that uses DNA mi- 
croarrays to survey the effects of drugs on global gene expres- 
sion patterns. We established that genetic and pharmacologic 
inhibition of gene function can result in extremely similar 
changes in gene expression. We also demonstrated that one can 
confirm a potential drug target by treating a deletion mutant 
defective in the gene encoding the putative target. Drug-medi- 
ated signatures from strains with mutations in pathways or 
processes directly or indirectly affected by the drug bore little or 
[Heat-map panel labels. Strain: wt, cna, fpr1, cph1, cna fpr1, gcn4;
FK506: +. Legible row labels include YKL218C, YJL171C, YBR004C, GYP7, YPS3,
CIT2, SNZ1, NUP133 and YOL015W; one group is marked as the 'complex
behavior' class. Color scale: fold repression to fold induction.]


no similarity to the wild-type drug expression profile. In con- 
trast, drug-mediated signatures from strains with mutations in 
genes involved in pathways unrelated to the drug's action 
showed extensive similarity to the wild-type drug signature. By 
applying this approach to a drug that affects multiple pathways 
(FK506), we were able to decode a complex signature into
component parts, including the identification of an 'off-target'
signature that was mediated through pathways independent of 
calcineurin or the Fprl immunophilin. 

Discussion 

It is well-established that high-throughput biochemical screen- 
ing can identify potent inhibitory compounds against a given 
target. The 'decoder' approach described here complements 
this process by evaluating the equally important property of 
specificity: the tendency of a compound to inhibit pathways 
other than that of its intended target. The ability to observe 
such 'off-target' effects will likely be useful in several ways. 
Profiling compounds with known toxicities will allow the de- 
velopment of a database of expression changes associated with 
particular toxicities. Recognition of potential toxicities in the 
'off-target' signatures of otherwise promising compounds then 
may allow earlier identification of those likely to fail in clinical 
trials. Comparing the extent and peculiarities of 'off-target'
signatures of promising drug candidates could provide a new way 
to group compounds by their effects on secondary pathways, 
even before those effects are understood. This may prove to be 
an alternative, potentially more effective, way to select com- 
pounds for animal and clinical trials. Some drugs are more ef- 
fective against a related protein than against the originally 
intended target. Sildenafil (Viagra™), for example, was initially 
developed as a phosphodiesterase inhibitor to control cardiac 
contractility, but was found to be highly specific for phospho- 
diesterase 5, an isozyme whose inhibition overcomes defects in 



Fig. 6 Response of FK506 signature genes in strains with deletions
in different genes. Genes with expression ratios greater than a factor
of 4 in at least one experiment are listed and their expression ratios in
the indicated strain are shown in the green (induction)-red
(repression) color scale. The genes have been divided into classes
corresponding to these expected behaviors: 'CNA-dependent' genes
respond to FK506 (50 µg/ml) except when either calcineurin genes or
FPR1 or both are deleted; 'GCN4-dependent' genes respond to FK506
except when GCN4 is deleted. These genes still respond to FK506
when calcineurin genes or FPR1 or CPH1 are deleted; that is, their
responses are not mediated by calcineurin, Cph1 or Fpr1. 'CNA- and
GCN4-independent' genes respond to FK506 in all deletion strains
tested. A 'complex behavior' class is provided for those genes that did
not match the model of FK506 response mediated through
calcineurin or Fpr1 or separately through Gcn4.
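
The class definitions in the Fig. 6 legend amount to a small decision rule, sketched below. The sketch assumes that each gene has already been called as responding or not responding to FK506 in each strain; how that call is made is not specified in the legend, and the function name classify and the dictionary layout are illustrative choices only.

```python
# Deletion strains shown in Fig. 6 (wild type is handled separately).
DELETION_STRAINS = ("cna", "fpr1", "cph1", "cna fpr1", "gcn4")

# Strains in which a calcineurin-mediated response is expected to vanish.
CNA_PATHWAY = frozenset({"cna", "fpr1", "cna fpr1"})


def classify(responds):
    """Assign a gene to one of the Fig. 6 classes.

    responds: dict mapping strain name ('wt' plus DELETION_STRAINS) to True
    if the gene responds to FK506 in that strain. The response calls
    themselves are assumed inputs, not computed here.
    """
    if not responds["wt"]:
        return "not in wild-type signature"
    lost = frozenset(s for s in DELETION_STRAINS if not responds[s])
    if lost == CNA_PATHWAY:
        return "CNA-dependent"
    if lost == frozenset({"gcn4"}):
        return "GCN4-dependent"
    if not lost:
        return "CNA- and GCN4-independent"
    return "complex behavior"


if __name__ == "__main__":
    example = {"wt": True, "cna": True, "fpr1": True, "cph1": True,
               "cna fpr1": True, "gcn4": False}
    print(classify(example))
```

Any pattern that matches none of the three expected behaviors falls through to the 'complex behavior' class, mirroring the catch-all class in the legend.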



penile erection. It is possible that application of the 'de- 
coder' to other compounds may show that they too have a 
potent activity against a target distinct from their in- 
tended target. 

The ability to decode drug effects is dependent on the
availability of functionally 'targetless' cells. In yeast, this
is being achieved by systematically disrupting each yeast
gene (Saccharomyces Deletion Consortium;
http://sequence-www.stanford.edu/group/yeast_deletion_project/deletion.html).
Efforts are underway to obtain
expression profiles from each deletion mutant strain.
Determining signatures resulting from inactivation of es- 
sential genes presents a unique problem, but it may be 
possible to do so by examining heterozygotes or by using a con- 
trollable promoter to reduce expression of the essential gene. 
Although it is already feasible to test several compounds in 
dozens of yeast strains, another challenge for the 'decoder' 
strategy will be the efficient selection of the mutants with dele- 
tions in genes most likely to encode the intended drug target. 
The signature correlation plots described are one metric that 
could be used as part of that selection process, but others need 
to be explored. Applying the 'decoder' to mammalian cells pre- 
sents additional challenges. It is considerably more difficult to 
isolate functionally 'targetless' cells. Strategies involving titrat- 
able promoters, known specific inhibitors, anti-sense RNAs, ri- 
bozymes, and methods of targeting specific proteins for 
degradation are possible and should be tested. Another limita- 
tion is that not all cell types express the same set of genes and 
therefore 'off-target' effects may be different in different cell 
types. In addition, applying the 'decoder' to human cells will 
also require technical improvements that allow expression pro- 
filing from a small number of cells. Even the broader question 
of whether the insensitivity of 'off-target' signatures to the dis- 
ruption of the main target is the exception or the rule can only 
be answered by the accumulation of more data. Barkai and 
Leibler, however, have argued in favor of robustness of biologi- 
cal networks, indicating that drug perturbations ('off-target' 
signatures) may be robust even when the system is subjected to 
another perturbation (such as a genetic disruption) (ref. 28). 
Many practical developments will be necessary if the 'decoder' 
concept is to be broadly applied. 

Expression arrays have been used mainly as an initial screen 
for genes induced in a particular tissue or process of interest by 
focusing on genes with large expression ratios. We have 
found, however, that effort to refine experimental protocols 
and repeat experiments increases the reliability of the data and 
permits new applications. For example, it provides a larger set 
Table 3 Yeast strains used

Strain   Relevant genotype                                                                                   Reference
YPH499   MATa ura3-52 lys2-801 ade2-101 trp1-Δ63 his3-Δ200 leu2-Δ1                                           (34)
R563     MATa ura3-52 lys2-801 ade2-101 trp1-Δ63 his3-Δ200 leu2-Δ1 his3::HIS3                                (this study)
R558     MATa ura3-52 lys2-801 ade2-101 trp1-Δ63 his3-Δ200 leu2-Δ1 fpr1::HIS3                                (this study)
R567     MATa ura3-52 lys2-801 ade2-101 trp1-Δ63 his3-Δ200 leu2-Δ1 cph1::HIS3                                (this study)
MCY300   MATa ura3-52 lys2-801 ade2-101 trp1-Δ63 his3-Δ200 leu2-Δ1 cna1Δ1::hisG cna2Δ1::HIS3                 (21)
R132     MATa ura3-52 lys2-801 ade2-101 trp1-Δ63 his3-Δ200 leu2-Δ1 cna1Δ1::hisG cna2Δ1::HIS3 cph1::kan^r     (this study)
R133     MATa ura3-52 lys2-801 ade2-101 trp1-Δ63 his3-Δ200 leu2-Δ1 cna1Δ1::hisG cna2Δ1::HIS3 fpr1::kan^r     (this study)
R559     MATa ura3-52 lys2-801 ade2-101 trp1-Δ63 his3-Δ200 leu2-Δ1 his3::HIS3 gcn4::LEU2                     (this study)
BY4719   MATa trp1-Δ63 ura3-Δ0                                                                               (35)
BY4738   MATα trp1-Δ63 ura3-Δ0                                                                               (35)
R491     MATa/α BY4719 × BY4738                                                                              (this study)
BY4728   MATa his3-Δ200 trp1-Δ63 ura3-Δ0                                                                     (35)
BY4729   MATα his3-Δ200 trp1-Δ63 ura3-Δ0                                                                     (35)
R1226    MATa/α BY4728 × BY4729                                                                              (this study)




of genes at higher confidence levels that serve as a more
unique signature for a given protein perturbation. In addition,
it allows subtle signatures to be detected when, for example, a
protein is only partially inhibited. This may enable clinical
monitoring of small changes in protein function in disease or
toxicity states before they could otherwise be detected.
Because the functions of many genes detected on transcript
arrays are known, these microarrays are powerful tools that
provide detailed information about a cell's physiology. For
example, changes in the flux through a metabolic pathway are
reflected in transcriptional changes in genes in the pathway (ref. 7).
Furthermore, it may be possible to indirectly measure protein
activity levels from expression profiling data (S.F. et al.,
unpublished data). Thus, although the eventual development of
genomic methods allowing the direct measurement of all
cellular protein levels will be an important achievement,
transcript array technology offers an immediate and robust means
of evaluating the effects of various treatments on gene
expression and protein function.

Methods

Construction, growth and drug treatment of yeast strains. The strains
used in this study (Table 3) were constructed by standard techniques (ref. 29).
To construct strain R559, strain R563 was transformed to Leu+ with
plasmid pM12 digested by SalI and MluI (provided by A. Hinnebusch and T.
Dever). Strains R132 and R133 were constructed by transforming the
bacterial kanamycin resistance cassette (ref. 30) flanked by genomic DNA from the
CPH1 and FPR1 loci, respectively, and selecting for G418-resistant
colonies. For experiments with FK506, cells were grown for three
generations to a density of 1 × 10^7 cells/ml in YAPD medium (YPD plus 0.004%
adenine) supplemented with 10 mM calcium chloride as described (ref. 31).
Where indicated, FK506 was added to a final concentration of 1 µg/ml
0.5 h after inoculation of the culture or to 50 µg/ml 1 h before cells were
collected. CsA was used at a final concentration of 50 µg/ml. Cells were
broken by standard procedures with the following modifications: cell
pellets were resuspended in breaking buffer (0.2 M Tris-HCl pH 7.6, 0.5 M
NaCl, 10 mM EDTA, 1% SDS), vortexed for 2 min on a VWR multi-tube
vortexer at setting 8 in the presence of 60% glass beads (425-600 µm
mesh; Sigma) and phenol:chloroform (50:50, volume/volume). After
separation of the phases, the aqueous phase was re-extracted and
ethanol-precipitated. Poly(A)+ RNA was isolated by two sequential
chromatographic purifications over oligo dT cellulose (New England
Biolabs, Beverly, Massachusetts) using established protocols.

For experiments using 3-AT, wild-type or his3/his3 cells were grown to
early logarithmic phase in SC medium, pelleted and resuspended in SC
medium lacking histidine for 1 h in the presence or absence of 10 mM
3-AT, as indicated. Cells were harvested and mRNA isolated as above.
FK506 was obtained from the Swedish Hospital Pharmacy (Seattle,
Washington) and purified to homogeneity by ethyl acetate extraction by
J. Simon (Fred Hutchinson Cancer Research Center, Seattle, Washington).
CsA was obtained from Alexis Biochemicals (San Diego, California); 3-AT
was from Sigma.

Preparation and hybridization of the labeled sample.
Fluorescently-labeled cDNA was prepared, purified and hybridized essentially as
described (ref. 7). Cy3- or Cy5-dUTP (Amersham) was incorporated into cDNA
during reverse transcription (Superscript II; Life Technologies) and
purified by concentrating to less than 10 µl using Microcon-30
microconcentrators (Amicon, Houston, Texas). Paired cDNAs were resuspended in
20-26 µl hybridization solution (3× SSC, 0.75 µg/ml poly(A) DNA, 0.2%
SDS) and applied to the microarray under a 22- × 30-mm coverslip for 6
h at 63 °C, all according to a published method (ref. 7).

Fabrication and scanning of microarrays. PCR products containing
common 5' and 3' sequences (Research Genetics, Huntsville, Alabama)
were used as templates with amino-modified forward primer and
unmodified reverse primers to PCR amplify 6,065 ORFs from the S. cerevisiae
genome. Our first-pass success rate was 94%. Amplification reactions that
gave products of unexpected sizes were excluded from subsequent
analysis. ORFs that could not be amplified from purchased templates were
amplified from genomic DNA. DNA samples from 100-µl reactions were
isopropanol-precipitated, resuspended in water, brought to a final
concentration of 3× SSC in a total volume of 15 µl, and transferred to
384-well microtiter plates (Genetix Limited, Christchurch, Dorset, England).
PCR products were spotted onto 1 × 3-inch polylysine-treated glass slides
by a robot built essentially according to defined specifications (refs. 3, 7)
(http://cmgm.stanford.edu/pbrown/MGuide). After being printed, slides
were processed according to published protocols (ref. 7).

Microarrays were imaged on a prototype multi-frame CCD camera in
development at Applied Precision (Issaquah, Washington). Each CCD
image frame was approximately 2-mm square. Exposure times of 2 s in
the Cy5 channel (white light through Chroma 618-648 nm excitation
filter, Chroma 657-727 nm emission filter) and 1 s in the Cy3 channel
(Chroma 535-560 nm excitation filter, Chroma 570-620 nm emission
filter) were done consecutively in each frame before moving to the next,
spatially contiguous frame. Color isolation between the Cy3 and Cy5
channels was about 100:1 or better. Frames were 'knitted' together in
software to make the complete images. The intensities of spots (about 100
µm) were quantified from the 10-µm pixels by frame-by-frame
background subtraction and intensity averaging in each channel. Dynamic
range of the resulting spot intensities was typically a ratio of 1,000
between the brightest spots and the background-subtracted additive error
level. Normalization between the channels was accomplished by
normalizing each channel to the mean intensities of all genes. This procedure is
nearly equivalent to normalization between channels using the intensity
ratio of genomic DNA spots (ref. 7), but is possibly more robust, as it is based on
the intensities of several thousand spots distributed over the array.
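
The channel normalization described above can be illustrated with a short sketch. It assumes background-subtracted, strictly positive spot intensities and forms the red/green (Cy5/Cy3) ratio used on the figure axes; the function names normalize_to_mean and log10_ratios are hypothetical, and the de-trending and error modeling described elsewhere in the Methods are not included.

```python
import math


def normalize_to_mean(intensities):
    """Scale one channel so that its mean spot intensity equals 1."""
    mean = sum(intensities) / len(intensities)
    return [value / mean for value in intensities]


def log10_ratios(cy5, cy3):
    """Per-spot log10(R/G) after normalizing each channel to its own mean.

    cy5, cy3: background-subtracted intensities for the same spots, in the
    same order. Values must be positive for the logarithm to be defined.
    """
    red = normalize_to_mean(cy5)
    green = normalize_to_mean(cy3)
    return [math.log10(r / g) for r, g in zip(red, green)]


if __name__ == "__main__":
    # A uniform twofold difference in channel brightness is removed entirely.
    cy3 = [100.0, 200.0, 300.0, 400.0]
    cy5 = [2.0 * v for v in cy3]
    print(log10_ratios(cy5, cy3))
```

Because each channel is divided by its own mean, a global difference in labeling efficiency or exposure between Cy5 and Cy3 cancels, leaving only spot-specific differences in the ratios.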

Signature correlation coefficients and their confidence limits.
Correlation coefficients between the signature ORFs of various
experiments were calculated using:

\[ \rho = \frac{\sum_k x_k y_k}{\left( \sum_k x_k^2 \, \sum_k y_k^2 \right)^{1/2}} \]

where x_k is the log10 of the expression ratio for the kth gene in the x
signature, and y_k is the log10 of the expression ratio for the kth gene in the y
signature. The summation is over those genes that were either up- or
down-regulated in either experiment at the 95% confidence level. These
genes each had a less than 5% chance of being actually unregulated
(having expression ratios departing from unity due to measurement errors
alone). This confidence level was assigned based on an error model which
assigns a lognormal probability distribution to each gene's expression
ratio with characteristic width based on the observed scatter in its
repeated measurements (repeated arrays at the same nominal experimental
conditions) and on the individual array hybridization quality. This latter
dependence was derived from control experiments in which both Cy3
and Cy5 samples were derived from the same RNA sample. For large
numbers of repeated measurements the error reduces to the observed
scatter. For a single measurement the error is based on the array quality
and the spot intensity.
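
As a minimal sketch of this calculation, the function below evaluates the uncentered correlation over a supplied set of signature genes. The error model that decides which genes are significant is not reproduced; the set of significant ORFs is taken as an input, and the function name and toy values are illustrative assumptions.

```python
import math


def signature_correlation(x, y, significant):
    """Uncentered correlation of two signatures over selected genes.

    x, y: dicts mapping ORF name to log10 expression ratio.
    significant: set of ORF names regulated in either experiment at the
    chosen confidence level (assumed to come from a separate error model).
    """
    genes = [g for g in significant if g in x and g in y]
    sum_xy = sum(x[g] * y[g] for g in genes)
    sum_xx = sum(x[g] ** 2 for g in genes)
    sum_yy = sum(y[g] ** 2 for g in genes)
    if sum_xx == 0 or sum_yy == 0:
        raise ValueError("correlation undefined: a signature is all zeros")
    return sum_xy / math.sqrt(sum_xx * sum_yy)


if __name__ == "__main__":
    # Toy log10 ratios for three hypothetical ORFs; only A and B are selected.
    x = {"A": 0.5, "B": -0.3, "C": 0.01}
    y = {"A": 0.4, "B": -0.35, "C": -0.02}
    print(signature_correlation(x, y, {"A", "B"}))
```

Note that the sums are not mean-centered, so this differs from the Pearson coefficient: a gene with a log ratio of zero contributes nothing to any of the three sums.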

Random measurement errors in the x and y signatures tend to bias the 
correlation towards zero. In most experiments, most genes are not signif- 
icantly affected but do show small random measurement errors. Selecting 
only the '95% confidence' genes for the correlation calculation, rather 
than the entire genome, reduces this bias and makes the actual biological 
correlations more apparent. 
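
The effect of gene selection on this bias can be illustrated with simulated data. In the sketch below, every parameter (number of genes, effect size, noise level, and the simple magnitude cutoff standing in for the 95% confidence selection) is an assumption chosen for illustration, not a value from the experiments.

```python
import math
import random


def uncentered_corr(xs, ys):
    """Uncentered correlation of two equal-length lists of log ratios."""
    sum_xy = sum(a * b for a, b in zip(xs, ys))
    return sum_xy / math.sqrt(sum(a * a for a in xs) * sum(b * b for b in ys))


def simulate(n_genes=2000, n_regulated=100, effect=0.5, noise=0.1,
             cutoff=0.3, seed=0):
    """Compare genome-wide and selected-gene correlations on toy data.

    Two experiments share the same true log ratios (nonzero for only
    n_regulated genes) and differ only by independent Gaussian noise.
    """
    rng = random.Random(seed)
    true = [effect if i % 2 == 0 else -effect for i in range(n_regulated)]
    true += [0.0] * (n_genes - n_regulated)
    x = [t + rng.gauss(0.0, noise) for t in true]
    y = [t + rng.gauss(0.0, noise) for t in true]
    # Stand-in for the confidence-based selection: keep genes whose measured
    # log ratio is large in either experiment.
    keep = [i for i in range(n_genes)
            if abs(x[i]) > cutoff or abs(y[i]) > cutoff]
    rho_all = uncentered_corr(x, y)
    rho_selected = uncentered_corr([x[i] for i in keep], [y[i] for i in keep])
    return rho_all, rho_selected


if __name__ == "__main__":
    rho_all, rho_selected = simulate()
    print(f"all genes: {rho_all:.2f}  selected genes: {rho_selected:.2f}")
```

Although the underlying biology is identical in the two simulated experiments, the genome-wide coefficient is pulled toward zero by the many unregulated genes, whereas the coefficient over selected genes stays much closer to unity.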

Correlations between a profile and itself are unity by definition. Error
limits on the correlation are 95% confidence limits based on the
individual measurement error bars, and assuming uncorrelated errors. They do
not include the bias mentioned above; thus, a departure of ρ from unity
does not necessarily mean that the underlying biological correlation is
imperfect. However, a correlation of 0.7 ± 0.1, for example, is very
significantly different from zero. Small (magnitude of ρ < 0.2) but formally
significant correlations in the tables and text probably are due to small
systematic biases in the Cy5/Cy3 ratios that violate the assumption of
independent measurement errors used to generate the 95% confidence
limits. Therefore, these small correlation values should be treated as not
significant. A likely source of uncorrected systematic bias is the partially
corrected scanner detector nonlinearity that differently affects the Cy3
and Cy5 detection channels.

The 1 µg/ml FK506 treatment signature was compared with more
than 40 unrelated deletion mutant strain or drug signatures. These control
profiles had correlation coefficients with the FK506 profile that were
distributed around zero (mean ρ = -0.03) with a standard deviation of
0.16 (data not shown), and none had correlations greater than ρ = 0.38.
Similarly, the calcineurin mutant strain signature correlated well with the
CsA treatment signature (ρ = 0.71 ± 0.04) but not with the signatures
from the negative controls (mean ρ = -0.02 with a standard deviation of
0.18).

Quality controls. End-to-end checks on expression ratio measurement 
accuracy were provided by analyzing the variance in repeated hybridiza- 
tions using the same mRNA labeled with both Cy3 and Cy5, and also 
using Cy3 and Cy5 mRNA samples isolated from independent cultures of 
the same nominal strain and conditions. Biases undetected with this procedure,
such as gene-specific biases presumably due to differential incorporation
of Cy3- and Cy5-dUTP into cDNA, were minimized by doing
hybridizations in fluor-reversed pairs, in which the Cy3/Cy5 labeling of
the biological conditions was reversed in one experiment with respect to 
the other. The expression ratio for each gene is then the ratio of ratios be- 
tween the two experiments in the pair. Other biases are removed by algo- 
rithmic numerical de-trending. The magnitude of these biases in the 
absence of de-trending and fluor reversal is typically about 30% in the 
ratio, but may be as high as twofold for some ORFs. 
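
Why a fluor-reversed pair cancels a gene-specific dye bias can be shown with a short worked example. This is a sketch under stated assumptions: the bias is modeled as a single multiplicative factor on the Cy5/Cy3 ratio for a given gene, and the square root used to return the ratio of ratios to the scale of a single ratio is a choice made here for illustration, not something the text specifies. The function name and all numbers are hypothetical.

```python
import math

def fluor_reversed_ratio(ratio_forward, ratio_reversed):
    """Combine a fluor-reversed pair of Cy5/Cy3 ratios for one gene.

    ratio_forward  : Cy5/Cy3 with the treated sample labeled Cy5
    ratio_reversed : Cy5/Cy3 with the treated sample labeled Cy3

    With a multiplicative gene-specific dye bias b and a true
    treated/control ratio t, ratio_forward = b * t and
    ratio_reversed = b / t, so the ratio of ratios is t**2 and b
    cancels. The square root (an assumption of this sketch) puts the
    result back on the scale of a single expression ratio.
    """
    return math.sqrt(ratio_forward / ratio_reversed)

true_ratio = 3.0   # hypothetical treated/control abundance ratio
dye_bias = 1.3     # hypothetical 30% bias favoring Cy5 for this gene

forward = dye_bias * true_ratio    # biased reading, treated sample in Cy5
reverse = dye_bias / true_ratio    # biased reading, treated sample in Cy3
combined = fluor_reversed_ratio(forward, reverse)
```

Either single hybridization would misreport the ratio by the full 30% bias, whereas the combined value recovers the assumed true ratio. Biases that are not a simple per-gene multiplicative factor would not cancel this way, which is presumably why additional de-trending is applied.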

Expression ratios are based on mean intensities over each spot. Some
smaller spots have fewer image pixels in the average. This does not degrade
accuracy noticeably until the number of pixels falls below ten, in
which case the spot is rejected from the data set. 'Wander' of spot posi- 
tions with respect to the nominal grid is adaptively tracked in array sub- 
regions by the image processing software. Unequal spot 'wander' within 
a subregion greater than half-a-spot spacing is a difficulty for the auto- 
mated quantitating algorithms; in this case, the spot is rejected from 
analysis based on human inspection of the 'wander'. Any spots partially 
overlapping are excluded from the data set. Fewer than 1% of spots typically
are rejected for these reasons.

The temporal program of gene expression during a model physiological response
of human cells, the response of fibroblasts to serum, was explored with
a complementary DNA microarray representing about 8600 different human 
genes. Genes could be clustered into groups on the basis of their temporal 
patterns of expression in this program. Many features of the transcriptional 
program appeared to be related to the physiology of wound repair, suggesting 
that fibroblasts play a larger and richer role in this complex multicellular 
response than had previously been appreciated. 



The response of mammalian fibroblasts to 
serum has been used as a model for studying 
growth control and cell cycle progression (1).
Normal human fibroblasts require growth 
factors for proliferation in culture; these 
growth factors are usually provided by fetal
bovine serum (FBS). In the absence of
growth factors, fibroblasts enter a nondividing
state, termed G0, characterized by low
metabolic activity. Addition of FBS or purified
growth factors induces proliferation of
the fibroblasts; the changes in gene expres- 
sion that accompany this proliferative re- 
sponse have been the subject of many studies, 
and the responses of dozens of genes to se- 
rum have been characterized. 

We took a fresh look at the response of 
human fibroblasts to serum, using cDNA mi- 
croarrays representing about 8600 distinct hu- 
man genes to observe the temporal program of 
transcription that underlies this response. Pri- 
mary cultured fibroblasts from human neonatal 
foreskin were induced to enter a quiescent state 
by serum deprivation for 48 hours and then 
stimulated by addition of medium containing 
10% FBS (2). DNA microarray hybridization 
was used to measure the temporal changes in 
mRNA levels of 8613 human genes (3) at 12
times, ranging from 15 min to 24 hours after 
serum stimulation. The cDNA made from pu- 
rified mRNA from each sample was labeled 
with the fluorescent dye Cy5 and mixed with a 
common reference probe consisting of cDNA 
made from purified mRNA from the quiescent
culture (time zero) labeled with a second fluorescent
dye, Cy3 (4). The color images of the
hybridization results (Fig. 1) were made by
representing the Cy3 fluorescent image as
green and the Cy5 fluorescent image as red and
merging the two color images.

Fig. 1. The same section of the microarray is shown for three independent
hybridizations comparing RNA isolated at the 8-hour time point after serum
treatment to RNA from serum-deprived cells. Each microarray contained 9996
elements, including 9804 human cDNAs, representing 8613 different genes.
mRNA from serum-deprived cells was used to prepare cDNA labeled with
Cy3-deoxyuridine triphosphate (dUTP), and mRNA harvested from cells at
different times after serum stimulation was used to prepare cDNA labeled
with Cy5-dUTP. The two cDNA probes were mixed and simultaneously hybridized
to the microarray. The image of the subsequent scan shows genes whose mRNAs
are more abundant in the serum-deprived fibroblasts (that is, suppressed by
serum treatment) as green spots and genes whose mRNAs are more abundant in
the serum-treated fibroblasts as red spots. Yellow spots represent genes
whose expression does not vary substantially between the two samples. The
arrows indicate the spots representing the following genes: 1, protein
disulfide isomerase-related protein P5; 2, IL-8 precursor; 3, EST AA057170;
and 4, vascular endothelial growth factor.
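
The color merge described here can be sketched in a few lines. This is an illustrative reconstruction only: the linear scaling to 8-bit values, the shared `max_intensity`, and the function name are assumptions made for the example, and the actual image-processing software is not described in the text.

```python
def merge_channels(cy3, cy5, max_intensity):
    """Merge two single-channel scans into one grid of (R, G, B) pixels.

    cy3, cy5      : 2-D lists of fluorescence intensities (same shape)
    max_intensity : intensity mapped to full brightness (assumed linear)
    Cy5 is shown as red and Cy3 as green; the blue channel stays at zero.
    """
    def scale(value):
        return max(0, min(255, round(255 * value / max_intensity)))

    return [
        [(scale(red), scale(green), 0) for green, red in zip(row3, row5)]
        for row3, row5 in zip(cy3, cy5)
    ]

# Three hypothetical spots: Cy3 only, Cy5 only, and equal signal in both.
cy3_scan = [[1000, 0, 1000]]
cy5_scan = [[0, 1000, 1000]]
pixels = merge_channels(cy3_scan, cy5_scan, max_intensity=1000)
```

The three toy spots come out pure green, pure red, and yellow respectively, matching the reading that green marks transcripts more abundant in the Cy3-labeled reference, red marks those more abundant in the Cy5-labeled sample, and yellow marks roughly equal abundance.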

Fig. 2. Cluster image showing the different classes of gene expression
profiles. Five hundred seventeen genes whose mRNA levels changed in response
to serum stimulation were selected (7). This subset of genes was clustered
hierarchically into groups on the basis of the similarity of their
expression profiles by the procedure of Eisen et al. (6). The expression
pattern of each gene in this set is displayed here as a horizontal strip.
For each gene, the ratio of mRNA levels in fibroblasts at the indicated time
after serum stimulation ("unsync" denotes exponentially growing cells) to
its level in the serum-deprived (time zero) fibroblasts is represented by a
color, according to the color scale at the bottom. The graphs show the
average expression profiles for the genes in the corresponding "cluster"
(indicated by the letters A to J and color coding). In every case examined,
when a gene was represented by more than one array element, the multiple
representations in this set were seen to have identical or very similar
expression profiles, and the profiles corresponding to these independent
measurements clustered either adjacent or very close to each other, pointing
to the robustness of the clustering algorithm in grouping genes with very
similar patterns of expression.

Diverse temporal profiles of gene expression
could be seen among the 8613 genes surveyed
in this experiment (Fig. 2); many of these
genes (about half) were unnamed expressed
sequence tags (ESTs) (5). Although diverse
patterns of expression were observed, the orderly
choreography of the expression program became
apparent when the results were analyzed
by a clustering and display method developed
in our laboratory for analyzing genome-wide
gene expression data (6). An example of such
an analysis, here applied to a subset of 517 
genes whose expression changed substantially 
in response to serum (7), is shown in Fig. 2. 
The entire detailed data set underlying Fig.
2 is available as a tab-delimited table (in
cluster order) at the Science Web site
(www.sciencemag.org/feature/data/984559.shl). In
addition, the entire, larger data set for the
complete set of genes analyzed in this experiment
can be found at a Web site maintained
by our laboratory (genome-www.stanford.edu/serum) (8).

One measure of the reliability of the 
changes we observed is inherent in the ex- 
pression profiles of the genes. For most genes 
whose expression levels changed, we could 
see a gradual change over a few time points, 
which thus effectively provided independent 
measurements for almost all of the observa- 
tions. An additional check was provided by 
the inclusion of duplicate and, in a few cases, 
multiple array elements representing the 
same gene for about 5% of the genes included 
in this microarray. In addition, three indepen- 
dent hybridizations to different microarrays 
with mRNA samples from cells harvested 8 
hours after serum addition showed good cor- 
relation (Fig. 1). As an independent test, we 
measured the expression levels of several 
genes using the TaqMan 5' nuclease fluori- 
genic quantitative polymerase chain reaction 
(PCR) assay (9). The expression profiles of
the genes, as measured by these two independent
methods, were very similar (Fig. 3) (10).

The transcriptional response of fibroblasts 
to serum was extremely rapid. The immediate 
response to serum stimulation was dominated 
by genes that encode transcription factors 
and other proteins involved in signal transduction.
The mRNAs for several genes [including
c-FOS, JUN B, and mitogen-activated
protein (MAP) kinase phosphatase-1
(MKP1)] were detectably induced within
15 min after serum stimulation (Fig. 4, A 
and B). Fifteen of the genes that were 
observed to be induced by serum encode 
known or suspected regulators of transcription
(Fig. 4B). All but one were immediate-early
genes: their induction was not inhibited
by cycloheximide (11). This class of
genes could be distinguished into those 
whose induction was transient (Fig. 2, clus- 
ter E) and those whose mRNA levels re- 
mained induced for much longer (Fig. 2, 
clusters I and J). Some features of the 
immediate response appeared to be directed 
at adaptation to the initiating signals. We 
observed a marked induction of mRNA 
encoding MKP1, a dual-specificity phosphatase
that modulates the activity of the
ERK1 and ERK2 MAP kinases (12). The
coincidence of the peak of expression of 
genes in cluster E (Fig. 2) with that of 
MKP1 (Fig. 4A) suggests the possibility
that continued activity of the MAP kinase pathway
is required to maintain induction of these
genes but not of those with sustained expression 
(clusters I and J). The gene encoding a second 
member of the dual-specificity MAP kinase 
phosphatase family, known as dual-specificity 
protein phosphatase 6/pyst2, was induced later, 
at about 4 hours after serum stimulation. Genes 
encoding diverse other proteins with roles in 
signal transduction, ranging from cell-surface 
receptors [for example, the sphingosine 1- 
phosphate receptor (EDG-1), the vascular en- 
dothelial growth factor receptor, and the type II 
BMP receptor] to regulators of G-protein signaling
(for example, NET1/p115 rho GEF) to
DNA-binding transcription factors, were induced
by serum (Fig. 4A).

The reprogramming of the regulatory cir- 
cuits in response to serum involved not only 
induction of transcription factors but also re- 
duced expression of many transcriptional reg- 
ulators — some of which may play roles in 
maintaining the cells in G0 or in priming
them to react to wounding (Fig. 4C). Perhaps 
as a consequence of the historical focus on 
genes induced by serum stimulation of fibro- 
blasts, the set of transcription factors whose 
expression diminished upon serum stimula- 
tion has been less well characterized. 

Genes known or likely to be involved in
controlling and mediating the proliferative response
showed distinctive patterns of regulation.
Several genes whose products inhibit progression
of the cell-division cycle, such as p27 Kip1,
p57 Kip2, and p18, were expressed in the
quiescent fibroblasts and down-regulated before
the onset of cell division. The nadir in the
mRNA levels for these genes occurred between
6 and 12 hours after serum stimulation (Fig.
5A), coincident with the passage of the fibroblasts
through G1. The levels of the transcript
encoding the WEE1-like protein kinase, which
is believed to inhibit mitosis by phosphorylation
of Cdc2, diminished between 4 and 8 to
12 hours after serum addition (Fig. 5A), well
before the onset of M phase at around 16 hours,
raising the possibility of an additional role for
Wee1 in an earlier stage of the cell cycle or in
regulating the G0 to G1 transition. Several
genes induced in the first few hours after serum
stimulation, such as the helix-loop-helix proteins
ID2 and ID3 and EST AA016305, a gene
with homology to G1-S cyclins, are candidates
for roles in promoting the exit from G0.

Genes involved in mediating progression 
through the cell cycle were characterized by a 
distinctive pattern of expression (Fig. 2, clus- 
ter D), reflecting the coincidence of their 
expression with the reentry of the stimulated 
fibroblasts into the cell-division cycle. The 
stimulated fibroblasts replicated their DNA 
about 16 hours after serum treatment. This 
timing was reflected by the induction of 
mRNA encoding both subunits of ribonucle- 
otide reductase and PCNA, the processivity 
factor for DNA polymerase epsilon and delta. 
Cyclin A, Cyclin B1, Cdc2, and CDC28 kinase,
regulators of passage through the S
phase and the transition from G2 to M phase,
were induced at about 16 to 20 hours after
serum addition. The kinase in the Cyclin
B1-CDK pair needs to be activated by phosphorylation.
The gene encoding Cyclin-dependent
kinase 7 (CDK7; a homolog of Xenopus
MO15 cdk-activating kinase) was induced
in parallel with the Cdc2 and Cdc28
kinases (Fig. 5A), suggesting a potential role
for CDK7 in mediating M phase. DNA topoisomerase
II α, required for chromosome segregation
at mitosis; Mad2, a component of
the spindle checkpoint that prevents comple- 
tion of mitosis (anaphase) if chromosomes 
are not attached to the spindle; and the kinet- 
ochore protein CENP-F all showed a similar 
expression profile. 

In the hours after the serum stimulus, one of 
the most striking features of the unfolding tran- 
scriptional program was the appearance of nu- 
merous genes with known roles in processes 
relevant to the physiology of wound healing.
These included both genes involved in the direct
role played by fibroblasts in remodeling of
the clot and the extracellular matrix and, more 
notably, genes encoding proteins involved in 
intercellular signaling (Fig. 5). Genes induced 
in this program encode products that can (i) 
participate in the dynamic process of clotting, 
clot dissolution, and remodeling and perhaps 
contribute to hemostasis by promoting local 
vasoconstriction (for example, endothelin-1); 
(ii) promote chemotaxis and activation of neu- 
trophils (for example, COX2) and recruitment 
and extravasation of monocytes and macro- 
phages (for example, MCP1); (iii) promote 
chemotaxis and activation of T lymphocytes 
[for example, interleukin-8 (IL-8)] and B 
lymphocytes (for example, ICAM-1), thus 
providing both innate and antigen-specific 
defenses against wound infection and recruit- 
ing the phagocytic cells that will be required 
to clear out the debris during remodeling of 
the wound; (iv) promote angiogenesis and 
neovascularization (for example, VEGF) 
through newly forming tissue; (v) promote 
migration and proliferation of fibroblasts (for 
example, CTGF) and their differentiation into 
myofibroblasts (for example, Vimentin); and 
(vi) promote migration and proliferation of 
keratinocytes, leading to reepithelialization 
of the wound (for example, FGF7), and pro- 
mote proliferation of melanocytes, perhaps 
contributing to wound hyperpigmentation 
(for example, FGF2). 

Coordinated regulation of groups of genes 
whose products act at different steps in a 
common process was a recurring theme. For 
example, Furin, a prohormone-processing 
protease required for one of the processing 
steps in the generation of active endothelin, 
was induced in parallel with induction of the 
gene encoding the precursor of endothelin-1 
(Fig. 5E) (13). Conversely, expression of
CALLA/CD10, a membrane metalloprotease 
that degrades endothelin-1 and other peptide 
mediators of acute inflammation, was reduced.

Fig. 3. Independent verification of microarray quantitation. Relative mRNA
levels of the indicated genes (Mast, mast/stem cell growth factor receptor)
were measured with the TaqMan 5' nuclease fluorigenic quantitative PCR
assay (9) (left) in the same samples that were used to prepare probes for
microarray hybridizations (right). Data from the TaqMan analysis were
normalized to mRNA concentrations and plotted relative to the level at
time zero, so that the results could be compared with those from the
microarray hybridizations. In general, quantitation with the two methods
gave very similar results (10).

A second example is provided by a set
of five genes involved in the biosynthesis of 
cholesterol (Fig. 5I). The mRNAs encoding
each of these enzymes showed sharply dimin- 
ished expression beginning 4 to 6 hours after 
serum stimulation of fibroblasts. A likely ex- 
planation for the coordinated down-regula- 
tion of the cholesterol biosynthetic pathway 
is that serum provides cholesterol to fibro- 
blasts through low-density lipoproteins, 
whereas in the absence of the cholesterol 
provided by serum, endogenous cholesterol 
biosynthesis in fibroblasts is required. 

Many of the previously studied genes that 
we observed to be regulated in this program 
have no recognized role in any aspect of wound 
healing or fibroblast proliferation. Their identi- 
fication in this study may therefore point to 
previously unknown aspects of these processes. 
A few selected genes in this group are shown in 
Fig. 5H. The stanniocalcin gene, for example 
(Fig. 5H), encodes a secreted protein without a 
clearly identified function in human cells (14,
15).

Fig. 4. "Reprogramming" of fibroblasts. Expression
profiles of genes whose function is likely to
play a role in the reprogramming phase of the
response are shown with the same representation
as in Fig. 2. In the cases in which a gene
was represented by more than one element in
the microarray, all measurements are shown.
The genes were grouped into categories on the
basis of our knowledge of their most likely role.
Some genes with pleiotropic roles were included
in more than one category.

Its induction in serum-stimulated fibroblasts
suggests the possibility that it may play a
role in the wound-healing process, perhaps 
serving as a signal in mediating inflammation 
or angiogenesis. 

One of the most important results of this 
exploration was the discovery of over 200 pre- 
viously unknown genes whose expression was 
regulated in specific temporal patterns during 
the response of fibroblasts to serum. For example,
13 of the 40 genes in cluster D (Fig. 2) have
descriptive names that reflect their putative
function. Nine of these 13 genes (69%) encode
proteins that play roles in cell cycle progression,
particularly in DNA replication and the
G2-M transition. This enrichment for cell
cycle-related genes suggests that some of the
unnamed genes in this cluster — for example,
EST W79311 and EST R13146, neither of
which has sequence similarity to previously
characterized genes — may represent previously 
unknown genes involved in this part of the cell 
cycle. Similarly, a remarkable fraction of genes 
that were grouped into cluster F on the basis of 
their expression profiles encoded proteins in- 
volved in intercellular signaling (Fig. 2), sug- 
gesting that a similar role should be considered 
for the many unnamed genes in this cluster. A 
disproportionately large fraction of the genes 
whose transcription diminished upon serum 
stimulation were unnamed ESTs. 

Fig. 5. The transcriptional response to serum suggests a multifaceted role for fibroblasts in the
physiology of wound healing. The features of the transcriptional program of fibroblasts in response
to serum stimulation that appear to be related to various aspects of the wound-healing process and
fibroblast proliferation are shown with the same convention for representing changes in transcript
levels as was used in Figs. 2 and 4. (A) Cell cycle and proliferation, (B) coagulation and hemostasis,
(C) inflammation, (D) angiogenesis, (E) tissue remodeling, (F) cytoskeletal reorganization, (G)
reepithelialization, (H) unidentified role in wound healing, and (I) cholesterol biosynthesis. The
numbers in (C) and (G) refer to genes whose products serve as signals to neutrophils (C1),
monocytes and macrophages (C2), T lymphocytes (C3), B lymphocytes (C4), and melanocytes (G1).

Our intention was to use this experiment as
a model to study the control of the transition
from G0 to a proliferating state. However, one
of the defining characteristics of genome-scale 
expression profiling experiments is that the ex- 
amination of so many diverse genes opens a 
window on all the processes that actually occur 
and not merely the single process one intended 
to observe. Serum, the soluble fraction of clot- 
ted blood, is normally encountered by cells in 
vivo in the context of a wound. Indeed, the 
expression program that we observed in re- 
sponse to serum suggests that fibroblasts are 
programmed to interpret the abrupt exposure to 
serum not as a general mitogenic stimulus but 
as a specific physiological signal, signifying a 
wound. The proliferative response that we orig- 
inally intended to study appeared to be part of a 
larger physiological response of fibroblasts to a 
wound. Other features of the transcriptional 
response to serum suggest that the fibroblast is 
an active participant in a conversation among 
the diverse cells that work together in wound 
repair, interpreting, amplifying, modifying, and 
broadcasting signals controlling inflammation, 
angiogenesis, and epithelial regrowth during 
the response to an injury. 

We recognize that these in vitro results 
almost certainly represent a distorted and in- 
complete rendering of the normal physiolog- 
ical response of a fibroblast to a wound. 
Moreover, only the responses elicited directly 
by exposure of fibroblasts to serum were 
examined. The subsequent signals from other 
cellular participants in the normal wound- 
healing process would certainly provoke fur- 
ther evolution of the transcriptional program 
in fibroblasts at the site of a wound, which 
this experiment cannot reveal. Nevertheless, 
we believe that the picture that emerged 
strongly suggests a much larger and richer 
role for the fibroblast in the orchestration of 
this important physiological process than had 
previously been suspected. 

References and Notes 

1. J. A. Winkles, Prog. Nucleic Acid Res. Mol. Biol. 58, 41
(1998).

2. A normal human diploid fibroblast cell line derived from
foreskin (ATCC CRL 2091) in passage 8 was used in
these experiments. The protocol followed for growth
arrest and stimulation was essentially that of (16) and
(17). Cells were grown to about 60% confluence in
15-cm petri dishes in Dulbecco's minimum essential
medium containing glucose (1 g/liter), the antibiotics
penicillin and streptomycin, and 10% (by vol) FBS (HyClone)
that had been previously heat inactivated at
56°C for 30 min. The cells were then washed three
times with the same medium lacking FBS, and low-serum
medium (0.1% FBS) was added to the plates.
After a 48-hour incubation, the medium was replaced
with fresh medium containing 10% FBS. mRNA was
isolated from several plates of cells harvested before
serum stimulation; this mRNA served as the serum-starved
or time-zero reference sample. Cells were harvested
from batches of plates at 11 subsequent intervals
(15 min, 30 min, 1, 2, 4, 6, 8, 12, 16, 20, and 24
hours) after the addition of serum. mRNA was also
isolated from exponentially growing fibroblasts (not
subjected to serum starvation). mRNA was isolated
with the FastTrack mRNA isolation kit (Invitrogen),
which involves lysis of the cells on the plate. The growth
medium was removed, and the cells were quickly
washed with phosphate-buffered saline at room temperature.
The lysis buffer was added to the plate, transferred
to tubes, and frozen in liquid nitrogen. Subsequent
steps were performed according to the kit
manufacturer's protocols.

3. The National Center for Biotechnology Information
maintains the UniGene database as a resource for partitioning
human sequences contained in GenBank into
clusters representing distinct transcripts or genes (18,
19). At the time this work began, this database contained
about 40,000 such clusters. We selected a subset
of 10,000 of these UniGene clusters for inclusion on
gene expression microarrays. UniGene clusters were
included only if they contained at least one clone from
the I.M.A.G.E. human cDNA collection (20), so that a
physical clone could easily be obtained (all I.M.A.G.E.
clones are available commercially from a number of
vendors). We attempted to include as complete as
possible a set of the "named" human genes (about
4000) and genes that appeared to be closely related to
named genes in other organisms (about an additional
2000). The remaining 4000 clones were chosen from
among the "anonymous" UniGene clusters on the basis
of inclusion on the human transcript map
(www.ncbi.nlm.nih.gov/SCIENCE96/) and the lack of apparent
homology to any other genes in the selected set. A physical
clone representing each of the selected genes was
obtained from Research Genetics. This "10K set" is
included in a more recent "15K set" described at
www.nhgri.nih.gov/DIR/LCG/15K/HTML/p15Ktop.html. Of
these clones, 472 are absent from the current edition of
UniGene and were presumed to be distinct genes. The
remainder represent 8141 distinct clusters, or human
genes, in UniGene. These clones, thus presumed to
represent 8613 different genes, were used to print
microarrays according to methods described previously
(21, 22).

4. One microgram of mRNA was used for making fluorescently
labeled cDNA probes for hybridizing to the microarrays,
with the protocol described previously (23).
mRNA from the large batch of serum-starved cells was
used to make cDNA labeled with Cy3. The Cy3-labeled
cDNA from this batch of serum-starved cells served as
the common reference probe in all hybridizations.
mRNA samples from cells harvested immediately before
serum stimulation, at intervals after serum stimulation,
and from exponentially growing cells were used
to make cDNA labeled with Cy5. Ten micrograms of
yeast tRNA, 10 µg of polydeoxyadenylic acid, and 20
µg of human CoT1 DNA (Gibco-BRL) were added to the
mixture of labeled probes in a solution containing 3×
standard saline citrate (SSC) and 0.3% SDS and allowed
to prehybridize at room temperature for 30 min before
the probe was added to the surface of the microarray.
Hybridizations, washes, and fluorescent scans were performed
as described previously (23, 24). All measurements,
totaling more than 180,000 differential expression
measurements, were stored in a computer database
for analysis and interpretation.

5. The nominal identities of a number of cDNAs (currently
about 3750) on the microarray were verified by sequencing.
The clones that were sequenced included
many of the genes whose expression changed substantially
upon serum stimulation, as well as a large number
of genes whose expression did not change substantially
in the course of this experiment. About 85% of the
clones on the current version of this microarray that
were checked by resequencing were correctly identified.
In all the figures, gene names or EST numbers are given
only for those genes on the microarrays whose identities
were reconfirmed by resequencing. In the cases
where a human gene has more than one name in the
literature, we have tried to use the name that is most
evocative of its presumed role in this context. The
remainder of the clones have been assigned a temporary
identification number (format SID######) and a
putative identity pending sequence verification. The
correct identities of these genes will be posted at our
Web site (genome-www.stanford.edu/serum) as they
are confirmed by resequencing.

6. M. B. Eisen, P. T. Spellman, P. O. Brown, D. Botstein,
Proc. Natl. Acad. Sci. U.S.A. 95, 14863 (1998).

7. Genes were selected for this analysis if either (i) their
expression level deviated from that in quiescent fibroblasts
by at least a factor of 2.20 in at least two of the
samples from serum-stimulated cells or (ii) the standard
deviation for the set of 13 values of log2(expression
ratio) measured for the gene in this time course exceeded
0.7. In addition, observations in which the pixel-by-pixel
correlation coefficients for the Cy3 and Cy5 fluorescence
signals measured in a given array element
were less than 0.6 were excluded. These selection criteria
yielded a computationally manageable number of genes
while minimizing the number of genes that were included
because of noise in the data.
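
The two selection criteria and the pixel-correlation filter can be sketched as a single predicate. This is a minimal illustration, not the authors' code, and it makes several assumptions that the note does not settle: the logarithm is taken as base 2, a "deviation by a factor of 2.20" is treated as either induction or repression, the sample standard deviation is used, and low-quality observations are dropped before both tests are applied. The function and parameter names are hypothetical.

```python
import math
import statistics

def select_gene(ratios, pixel_corr, fold=2.20, min_hits=2,
                sd_cutoff=0.7, corr_cutoff=0.6):
    """Decide whether one gene passes the selection criteria.

    ratios     : expression ratios relative to quiescent fibroblasts
    pixel_corr : per-observation Cy3/Cy5 pixel-by-pixel correlations
    Observations with pixel correlation below corr_cutoff are excluded.
    The gene passes if (i) at least min_hits remaining observations
    deviate by the given fold in either direction, or (ii) the standard
    deviation of the remaining log2 ratios exceeds sd_cutoff.
    """
    logs = [math.log2(r) for r, c in zip(ratios, pixel_corr)
            if c >= corr_cutoff]
    if len(logs) < 2:
        return False
    threshold = math.log2(fold)
    hits = sum(1 for value in logs if abs(value) >= threshold)
    return hits >= min_hits or statistics.stdev(logs) > sd_cutoff

good_quality = [0.9] * 13
flat = [1.0, 1.1, 0.9, 1.0, 1.2, 0.95, 1.05, 1.0, 0.9, 1.1, 1.0, 1.0, 1.05]
induced = [1.0, 1.2, 2.5, 3.0, 2.8, 2.0, 1.5, 1.2, 1.0, 1.0, 0.9, 1.0, 1.1]
spiky = [1.0] * 11 + [4.0, 5.0]
spiky_quality = [0.9] * 11 + [0.3, 0.4]   # the two spikes come from poor spots

flat_selected = select_gene(flat, good_quality)
induced_selected = select_gene(induced, good_quality)
spiky_selected = select_gene(spiky, spiky_quality)
```

With these hypothetical inputs only the `induced` profile is selected. The `spiky` profile would pass criterion (i) if its two large ratios were trusted, but both come from observations below the 0.6 pixel-correlation cutoff and are discarded, which illustrates how the quality filter limits genes included because of noise.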

8. A more complete analysis and interpretation of the
results of this experiment, as well as a searchable database,
can be found at genome-www.stanford.edu/serum.

9. K. J. Livak, S. J. Flood, J. Marmaro, W. Giusti, K. Deetz,
PCR Methods Appl. 4, 357 (1995).

10. The apparent dip in the profile of COX2 at the 2-hour
time point in the microarray data appears to result
from a localized area of low intensity on the corresponding
array scan resulting in an underestimation
of the expression ratio. The expression ratios measured
for mast/stem cell growth factor receptor are
somewhat lower in the microarray data. This discrepancy
is probably a consequence of the conservative
background subtraction method used for quantitating
the signal intensities on the array scans (23). The
sequences of the PCR primer pairs (5' to 3') that
were used are as follows: COX2, CCGTGGCTCTCTTGGCAG
and CTAAGTTCTTTAGCACTCCTTGCCA; IL-8,
CGATGCTGTGGAGCTGTATC and CCATGGTTTCACCAAAGATG;
mast/stem cell factor receptor, ACAGAAGCCCGTGGTAGACC
and GAGGCTGGGAGGAGGAAG; B4-2, AAACCCCCCTCAGGAAAGAG
and CCATGAACAAGCTGGCCAT; and actin, ACTACTCCGTGTGGATCGGC
and GCTGATCCACATCTGCTGGA.

11. V. R. Iyer et al., unpublished data. The gene expression
data for the early time points in the presence of
cycloheximide will be available at our Web site
(genome-www.stanford.edu/serum).

12. T. Hunter, Cell 80, 225 (1995).

13. J. Leppaluoto and H. Ruskoaho, Ann. Med. 24, 153 
(1992). 

14. A. C. Chang et al., Mol. Cell. Endocrinol. 112, 241
(1995).

15. K. L. Madsen et al., Am. J. Physiol. 274, G96 (1998).

16. W. Krek and J. A. DeCaprio, Methods Enzymol. 254,
114 (1995).

17. R. A. Tobey, J. G. Valdez, H. A. Crissman, Exp. Cell Res.
179, 400 (1988).

18. M. S. Boguski and G. D. Schuler, Nature Genet. 10,
369 (1995).

19. G. D. Schuler, J. Mol. Med. 75, 694 (1997).

20. G. Lennon, C. Auffray, M. Polymeropoulos, M. B.
Soares, Genomics 33, 151 (1996).

21. I.M.A.G.E. clones were amplified by PCR in 96-well
format with amino-linked primers at the 5' end. Purified
PCR products were suspended at a concentration of
~0.5 mg/ml in 3× SSC, and ~5 ng of each product was
arrayed onto coated glass by means of procedures
similar to those described previously (22). A total of
9996 elements were arrayed onto an area of 1.8 cm by
1.8 cm with the elements spaced 175 µm apart. The
microarrays were then postprocessed to fix the DNA to
the glass surface before hybridization with a procedure
similar to previously described methods (22).

22. M. Schena, D. Shalon, R. W. Davis, P. O. Brown,
Science 270, 467 (1995).

23. J. L. DeRisi, V. R. Iyer, P. O. Brown, ibid. 278, 680
(1997).

24. J. DeRisi et al., Nature Genet. 14, 457 (1996).

25. We thank L. Chung for help with sequencing, A. Alizadeh
for help with sequence verification, K. Ranade for
advice on the TaqMan assay, and J. DeRisi and other
members of the P.O.B. and D.B. labs for discussions.
Supported by a grant from the National Human Genome
Research Institute (NHGRI) (HG00450) and the
National Cancer Institute (NIH CA 77097). V.R.I. was
supported in part by an Institutional Training Grant
in Genome Sciences (T32 HG00044) from the
NHGRI. M.B.E. is an Alfred P. Sloan Foundation Postdoctoral
Fellow in Computational Molecular Biology, and
D.T.R. is a Walter and Idun Berry Fellow. P.O.B. is an
Associate Investigator of the Howard Hughes Medical
Institute.

13 August 1998; accepted 13 November 1998 



www.sciencemag.org SCIENCE VOL 283 1 JANUARY 1999 






Exhibit E of Iyer Declaration
with Response dated 03/18/04
In USSN: 10/031,904

© 2000 Nature America Inc. • http://genetics.nature.com



Systematic variation in gene expression 
patterns in human cancer cell lines 

Douglas T. Ross 1, Uwe Scherf 5, Michael B. Eisen 2, Charles M. Perou 2, Christian Rees 2, Paul Spellman 2,
Vishwanath Iyer 1, Stefanie S. Jeffrey 3, Matt Van de Rijn 4, Mark Waltham 5, Alexander Pergamenschikov 2,
Jeffrey C.F. Lee 6, Deval Lashkari 7, Dari Shalon 6, Timothy G. Myers 8, John N. Weinstein 5, David Botstein 2
& Patrick O. Brown 1,9

We used cDNA microarrays to explore the variation in expression of approximately 8,000 unique genes among the 
60 cell lines used in the National Cancer Institute's screen for anti-cancer drugs. Classification of the cell lines based 
solely on the observed patterns of gene expression revealed a correspondence to the ostensible origins of the 
tumours from which the cell lines were derived. The consistent relationship between the gene expression patterns 
and the tissue of origin allowed us to recognize outliers whose previous classification appeared incorrect. Specific 
features of the gene expression patterns appeared to be related to physiological properties of the cell lines, such 
as their doubling time in culture, drug metabolism or the interferon response. Comparison of gene expression pat- 
terns in the cell lines to those observed in normal breast tissue or in breast tumour specimens revealed features of 
the expression patterns in the tumours that had recognizable counterparts in specific cell lines, reflecting the 
tumour, stromal and inflammatory components of the tumour tissue. These results provided a novel molecular 
characterization of this important group of human cell lines and their relationships to tumours in vivo. 



Introduction 

Cell lines derived from human tumours have been extensively used 
as experimental models of neoplastic disease. Although such cell 
lines differ from both normal and cancerous tissue, the inaccessi- 
bility of human tumours and normal tissue makes it likely that 
such cell lines will continue to be used as experimental models for 
the foreseeable future. The National Cancer Institute's Developmental
Therapeutics Program (DTP) has carried out intensive
studies of 60 cancer cell lines (the NCI60) derived from tumours
from a variety of tissues and organs 1-4. The DTP has assessed many
molecular features of the cells related to cancer and chemothera- 
peutic sensitivity, and has measured the sensitivities of these 60 cell 
lines to more than 70,000 different chemical compounds, includ- 
ing all common chemotherapeutics (http://dtp.nci.nih.gov). A 
previous analysis of these data revealed a connection between the 
pattern of activity of a drug and its method of action. In particular, 
there was a tendency for groups of drugs with similar patterns of 
activity to have related methods of action 3,5-7.

We used DNA microarrays to survey the variation in abun- 
dance of approximately 8,000 distinct human transcripts in these 
60 cell lines. Because of the logical connection between the func- 
tion of a gene and its pattern of expression, the correlation of gene 
expression patterns with the variation in the phenotype of the cell 
can begin the process by which the function of a gene can be 
inferred. Similarly, the patterns of expression of known genes can
reveal novel phenotypic aspects of the cells and tissues studied 8-10.
Here we present an analysis of the observed patterns of gene
expression and their relationship to phenotypic properties of the
60 cell lines. The accompanying report 11 explores the relationship
between the gene expression patterns and the drug sensitivity pro- 
files measured by the DTP. The assessment of gene expression pat- 
terns in a multitude of cell and tissue types, such as the diverse set 
of cell lines we studied here, under diverse conditions in vitro and 
in vivo, should lead to increasingly detailed maps of the human 
gene expression program and provide clues as to the physiological 
roles of uncharacterized genes 11-16. The databases, plus tools for
analysis and visualization of the data, are available
(http://genome-www.stanford.edu/nci60 and http://discover.nci.nih.gov).

Results 

We studied gene expression in the 60 cell lines using DNA 
microarrays prepared by robotically spotting 9,703 human 
cDNAs on glass microscope slides 17,18 . The cDNAs included 
approximately 8,000 different genes: approximately 3,700 repre- 
sented previously characterized human proteins, an additional 
1,900 had homologues in other organisms and the remaining 
2,400 were identified only by ESTs. Due to ambiguity of the iden- 
tity of the cDNA clones used in these studies, we estimated that 
approximately 80% of the genes in these experiments were correctly
identified. The identities of approximately 3,000 cDNAs from these
experiments have been sequence-verified, including all of those
referred to here by name.

Fig. 1 Gene expression patterns related to the tissue of origin of the cell lines. Two-dimensional
hierarchical clustering was applied to expression data from a set of 1,161 cDNAs
measured across 64 cell lines. The 1,161 cDNAs were those (of 9,703 total) with transcript
levels that varied by at least sevenfold (log2(ratio) > 2.8) relative to the reference pool in at
least 4 of 60 cell lines. This effectively selected genes with the greatest variation in expression
level across the 60 cell lines (including those genes not well represented in the reference
pool), and therefore highlighted those gene expression patterns that best
distinguished the cell lines from one another. Data from 64 hybridizations were used, one
for each cell line plus the two additional independent representations of each of the cell
lines K562 and MCF7. The two cell lines represented in triplicate were correspondingly
weighted for the gene clustering so that each of the 60 cell lines contributed equally to the
clustering. a, The cell-line dendrogram, with the terminal branches coloured to reflect the
ostensible tissue of origin of the cell line (red, leukaemia; green, colon; pink, breast; purple,
prostate; light blue, lung; orange, ovarian; yellow, renal; grey, CNS; brown, melanoma;
black, unknown (NCI/ADR-RES)). The scale to the right of the dendrogram depicts the
correlation coefficient represented by the length of the dendrogram branches connecting
pairs of nodes. Note that the two triplets of replicated cell lines (K562 and MCF7) cluster
tightly together and were well differentiated from even the most closely related cell lines,
indicating that this clustering of cell lines is based on characteristic variations in their gene
expression patterns rather than artefacts of the experimental procedures. b, A coloured
representation of the data table, with the rows (genes) and columns (cell lines) in cluster
order. The dendrogram representing hierarchical relationships between genes was omitted
for clarity, but is available (http://genome-www.stanford.edu/nci60). The colour in each
cell of this table reflects the mean-adjusted expression level of the gene (row) and cell line
(column). The colour scale used to represent the expression ratios is shown. The labels
'3a-3d' in (b) refer to the clusters of genes shown in detail in Fig. 3.
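
The replicate weighting described in the Fig. 1 caption can be illustrated with a weighted correlation between two gene profiles. This is a sketch under assumptions: the source does not state the similarity metric or the weighting formula, so a centred Pearson correlation with weight 1/n for each of n replicate hybridizations is assumed here, and both function names are hypothetical.

```python
import math

def replicate_weights(labels):
    """Weight each hybridization by 1 / (number of hybridizations of that cell line)."""
    counts = {}
    for name in labels:
        counts[name] = counts.get(name, 0) + 1
    return [1.0 / counts[name] for name in labels]

def weighted_pearson(x, y, w):
    """Pearson correlation of x and y with per-observation weights w."""
    total = sum(w)
    mx = sum(wi * xi for wi, xi in zip(w, x)) / total
    my = sum(wi * yi for wi, yi in zip(w, y)) / total
    cov = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
    vx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
    vy = sum(wi * (yi - my) ** 2 for wi, yi in zip(w, y))
    return cov / math.sqrt(vx * vy)
```

With these weights, three identical hybridizations of one cell line influence the gene-to-gene correlation exactly as much as a single hybridization of any other line.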

Each hybridization compared Cy5-labelled cDNA reverse tran- 
scribed from mRNA isolated from one of the cell lines with Cy3- 
labelled cDNA reverse transcribed from a reference mRNA 
sample. This reference sample, used in all hybridizations, was 
prepared by combining an equal mixture of mRNA from 12 of 
the cell lines (chosen to maximize diversity in gene expression as 
determined primarily from two-dimensional gel studies 2 ). By 
comparing cDNA from each cell line with a common reference, 
variation in gene expression across the 60 cell lines could be 
inferred from the observed variation in the normalized Cy5/Cy3 
ratios across the hybridizations. 
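
The logic of the common-reference design can be shown with a short sketch: because every array uses the same Cy3-labelled reference, the reference signal cancels when the log ratios of two cell lines are compared. The function names are hypothetical, and the intensities are assumed to be already background-subtracted and normalized.

```python
import math

def log2_ratio(cy5, cy3):
    """log2 of the normalized Cy5/Cy3 ratio for one spot."""
    return math.log2(cy5 / cy3)

def relative_expression(cy5_a, cy3_a, cy5_b, cy3_b):
    """log2 expression of a gene in cell line A relative to cell line B.

    If the Cy3 (reference) signal comes from the same pool on both arrays,
    it cancels in the difference of the two log ratios.
    """
    return log2_ratio(cy5_a, cy3_a) - log2_ratio(cy5_b, cy3_b)
```

For instance, an eightfold difference between two cell lines gives a value of 3 whether the gene is scarce or abundant in the reference pool, provided the reference signal is measurable.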

To assess the contribution of artefactual sources of variation in 
the experimentally measured expression patterns, K562 and 
MCF7 cell lines were each grown in three independent cultures, 
and the entire process was carried out independently on mRNA 
extracted from each culture. The variance in the triplicate fluo- 
rescence ratio measurements approached a minimum when the 
fluorescence signal was greater than approximately 0.4% of the 
measurable total signal dynamic range above background in 
either channel of the hybridization. We selected the subset of 
spots for which significant signal was present in both the numer- 
ator and denominator of the ratios by this criterion to identify 
the best-measured spots. The pair-wise correlation coefficients 
for the triplicates of the set of genes that passed this quality con- 
trol level (6,992 spots included for the MCF7 samples and 6,161 
spots for K562) ranged from 0.83 to 0.92 (for graphs and details, 
see http://genome-www.stanford.edu/nci60). 
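
The quality filter and triplicate comparison can be sketched as below. This is illustrative only: the source does not say whether correlations were computed on ratios or log ratios (log2 ratios are assumed), and the function names, the data layout and the `dynamic_range` parameter are hypothetical.

```python
import math
from itertools import combinations

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)

def replicate_correlations(replicates, dynamic_range, fraction=0.004):
    """Pairwise correlations of log2(Cy5/Cy3) over well-measured spots.

    replicates: list of hybridizations, each a list of (cy5, cy3)
    background-subtracted intensities in the same spot order.
    A spot is kept only if both channels exceed fraction * dynamic_range
    in every replicate.
    """
    floor = fraction * dynamic_range
    keep = [i for i in range(len(replicates[0]))
            if all(min(rep[i]) > floor for rep in replicates)]
    logs = [[math.log2(rep[i][0] / rep[i][1]) for i in keep]
            for rep in replicates]
    return [pearson(a, b) for a, b in combinations(logs, 2)]
```

Three replicate hybridizations yield three pairwise coefficients, which is the form of the 0.83 to 0.92 range reported above.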

To make the orderly features in the data more apparent, we used 
a hierarchical clustering algorithm 19,20 and a pseudo-colour
visualization matrix 3,21. The object of the clustering was to group cell
lines with similar repertoires of expressed genes and to group 
genes whose expression level varied among the 60 cell lines in a 
similar manner. Clustering was performed twice using different 
subsets of genes to assess the robustness of the analysis. In one case 
(Fig. 1), we concentrated on those genes that showed the most 
variation in expression among the 60 cell lines (1,161 total). A sec-
ond analysis (Fig. 2) included all spots that were thought to be well 
measured in the reference set (6,831 spots). 
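
A minimal sketch of hierarchical clustering of expression profiles follows. The source says only that a hierarchical clustering algorithm was used; average linkage with a distance of 1 minus the Pearson correlation is an assumption made here for illustration, and all function names are hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)

def cluster(profiles):
    """Average-linkage clustering; returns the tree as nested 2-tuples of names.

    profiles: dict mapping a name (cell line or gene) to its expression values.
    """
    names = list(profiles)
    dist = {frozenset((a, b)): 1 - pearson(profiles[a], profiles[b])
            for i, a in enumerate(names) for b in names[i + 1:]}
    active = [(n, [n]) for n in names]  # (subtree, member names)
    while len(active) > 1:
        best = None
        for i in range(len(active)):
            for j in range(i + 1, len(active)):
                mi, mj = active[i][1], active[j][1]
                # average distance over all cross-cluster pairs
                d = sum(dist[frozenset((a, b))] for a in mi for b in mj)
                d /= len(mi) * len(mj)
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        merged = ((active[i][0], active[j][0]), active[i][1] + active[j][1])
        active = [c for k, c in enumerate(active) if k not in (i, j)] + [merged]
    return active[0][0]

def leaves(tree):
    """List the names under a subtree."""
    if isinstance(tree, tuple):
        return leaves(tree[0]) + leaves(tree[1])
    return [tree]
```

Applying `cluster` once to the cell-line profiles and once to the gene profiles gives the two orderings needed for a two-dimensional cluster diagram.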

Gene expression patterns related to the histologic 
origins of the cell lines 

The most notable property of the clustered data was that cell lines 
with common presumptive tissues of origin grouped together 
(Figs 1a and 2). Cell lines derived from leukaemia, melanoma,
central nervous system, colon, renal and ovarian tissue were clustered
into independent terminal branches specific to their respective
organ types with few exceptions. Cell lines derived from
non-small-cell lung carcinoma and breast tumours were distributed
in multiple different terminal branches, suggesting that their gene
expression patterns were more heterogeneous.

Many of these coherent cell line clusters were distinguished by 
the specific expression of characteristic groups of genes 
(Fig. 3a-d). For example, a cluster of approximately 90 genes was 
highly expressed in the melanoma-derived lines (Fig. 3c). This set 
was enriched for genes with known roles in melanocyte biology, 
including tyrosinase and dopachrome tautomerase (TYR and 
DCT; two subunits of an enzyme complex involved in melanin 
synthesis 22), MART1 (MLANA; which is being investigated as a
target for immunotherapy of melanoma 23) and S100-β (S100B;
which has been used as an antigenic marker in the diagnosis of
melanoma). LOXIMVI, the seventh line designated as melanoma
in the NCI60, did not show this characteristic pattern. Although
isolated from a patient with melanoma, LOXIMVI has previously
been noted to lack melanin and other markers useful for
identification of melanoma cells 1.

Fig. 2 Gene expression patterns related to other cell-line phenotypes. a, We applied
two-dimensional hierarchical clustering to expression data from a set of 6,831 cDNAs
measured across the 64 cell lines. The 6,831 cDNAs were those with a minimum
fluorescence signal intensity of approximately 0.4% of the dynamic range above
background in the reference channel in each of the six hybridizations used to establish
reproducibility. This effectively selected those spots that provided the most reliable ratio
measurements and therefore identified a subset of genes useful for exploring patterns
comprised of those whose variation in expression across the 60 cell lines was of moderate
magnitude. b, Cluster-ordered data table. c, Doubling time of cell lines. Cell lines are given
in cluster order. Values are plotted relative to the mean. Doubling times greater than the
mean are shown in green, those with doubling time less than the mean are shown in red.
d, Three related gene clusters that were enriched for genes whose expression level
variation was correlated with cell line proliferation rate. Each of the three gene clusters
(clustered solely on the basis of their expression patterns) showed enrichment for sets of
genes involved in distinct functional categories (for example, ribosomal genes versus genes
involved in pre-RNA splicing). e, Gene cluster in which all characterized and
sequence-verified cDNAs encode genes known to be regulated by interferons. f, Gene
cluster enriched for genes that have been implicated in drug metabolism (indicated by
asterisks). A further property of the gene clustering evident here and in Fig. 2 is the strong
tendency for redundant representations of the same gene to cluster immediately adjacent
to one another, even within larger groups of genes with very similar expression patterns.
In addition to illustrating the reproducibility and consistency of the measurements, and
providing independent confirmation of many of our measurements, this property also
demonstrates that these, and probably all, genes have nearly unique patterns of variation
across the 60 cell lines. If this were not the case, and multiple genes had identical patterns
of variation, we would not expect to be able to distinguish, by clustering on the basis
of expression variation, duplicate copies of individual genes from the other genes with
identical expression patterns.

Paradoxically, two related cell lines (MDA-MB435 and MDA- 
N), which were derived from a single patient with breast cancer 
and have been conventionally regarded as breast cancer cell lines, 
shared expression of the genes associated with melanoma. MDA- 
MB435 was isolated from a pleural effusion in a patient with 
metastatic ductal adenocarcinoma of the breast 24,25. It remains
possible that the origin of the cell line was a breast cancer, and that 
its gene expression pattern is related to the neuroendocrine fea- 
tures of some breast cancers 26 . But our results suggest that this cell 
line may have originated from a melanoma, raising the possibility 
that the patient had a co-existing occult melanoma. 

The higher-level organization of the cell-line tree — in which 
groups span cell lines from different tissue types — also reflected 
shared biological properties of the tissues from which the cell 
lines were derived. The carcinoma-derived cell lines were divided 
into major branches that separated those that expressed genes 
characteristic of epithelial cells from those that expressed genes 
more typical of stromal cells. A cluster of genes is shown (Fig. 3b) 
that is most strongly expressed in cell lines derived from colon 
carcinomas, six of seven ovarian-derived cell lines and the two 
breast cancer lines positive for the oestrogen receptor. The named 
genes in this cluster have been implicated in several aspects of 
epithelial cell biology 27 . The cluster was enriched for genes whose 
products are known to localize to the basolateral membrane of 
epithelial cells, including those encoding components of 
adherens complexes (for example, desmoplakin (DSP), 
periplakin (PPL) and plakoglobin (JUP)), an epithelial-expressed
cell-cell adhesion molecule (M4S1) and a sodium/hydrogen
ion exchanger 28-31 (SLC9A1). It also contained genes
that encode putative transcriptional regulators of epithelial
morphogenesis, a human homologue of a Drosophila melanogaster
epithelial-expressed tumour suppressor (LLGL1) and a homeobox
gene thought to control calcium-mediated adherence in
epithelial cells 32,33 (MSX2).

In contrast, a separate, major branch of the cell-line dendrogram
(Fig. 1a) included all glioblastoma-derived cell lines, all
renal-cell-carcinoma-derived cell lines and the remaining
carcinoma-derived lines. The characteristic set of genes expressed in
this cluster included many whose products are involved in stromal
cell functions (Fig. 3d). Indeed, the two cell lines originally
described as 'sarcoma-like' in appearance (Hs578T, breast
carcinosarcoma, and SF539, gliosarcoma) expressed most of these
genes 34,35. Although no single gene was uniformly characteristic
of this cluster, each cell line showed a distinctive pattern of 
expression of genes encoding proteins with roles in synthesis or 
modification of the extracellular matrix (for example, caldesmon 
(CALD1), cathepsins, thrombospondin (THBS), lysyl oxidase 
(LOX) and collagen subtypes). Although the ovarian and most 
non-small-cell-lung-derived carcinomas expressed genes charac- 
teristic of both epithelial cells and stromal cells, they probably 
clustered with the CNS and renal cell carcinomas in this analysis 
because genes characteristically expressed in stromal cells were
more abundantly represented in this gene set. 

Physiological variation reflected 
in gene expression patterns 

A cluster diagram of 6,831 genes (Fig. 2) is useful for exploring 
clusters of genes whose variation in mRNA levels was not obvi- 
ously attributable to cell or tissue type. We identified some gene 
clusters that were enriched for genes involved in specific cellular 



processes; the variation in their expression levels may reflect cor- 
responding differences in activity of these processes in the cell 
lines. For example, a cluster of 1,159 genes (Fig. 2a) included 
many whose products are necessary for progression through the 
cell cycle (such as CCNA1, MCM106 and MAD2L1), RNA pro- 
cessing and translation machinery (such as RNA helicases, 
hnRNPs and translation elongation factors) and traditional 
pathologic markers used to identify proliferating cells (MKI67). 
Within this large cluster were smaller clusters enriched for genes 
with more specialized roles. One cluster was highly enriched for 
numerous ribosomal genes, whereas another was more enriched 
for genes encoding RNA-splicing factors. The variation in 
expression of these ribosomal genes was significantly correlated 
with variation in the cell doubling time (correlation coefficient of 
0.54), supporting the notion that the genes in this cluster were 
regulated in relation to cell proliferation rate or growth rate in 
these cell lines. 

In a smaller gene cluster (Fig. 2d), all of the named genes were
previously known to be regulated by interferons 13,36. Additional
groups of interferon-regulated genes showed distinct patterns of
expression (data not shown), suggesting that the NCI60 cell lines
exhibited variation in activity of interferon-response pathways,
which was reflected in gene expression patterns 36.

Another cluster (Fig. 2e) contained several genes encoding 
proteins with possible interrelated roles in drug metabolism, 
including glutamate-cysteine ligase (GLCLC, the enzyme respon- 
sible for the rate limiting step of glutathione synthesis), thiore- 
doxin (TXN) and thioredoxin reductase (TXNRD1; enzymes 
involved in regulating redox state in cells), and MRP1 (a drug 
transporter known to efficiently transport glutathione-conjugated
compounds 37). The elevated expression of this set of genes
in a subset of these cell lines may reflect selection for resistance to 
chemotherapeutics. 

Cell lines facilitate interpretation of gene expression 
patterns in complex clinical samples 

Like many other types of cancer, tumours of the breast typically 
have a complex histological organization, with connective tissue 
and leukocytic infiltrates interwoven with tumour cells. To 
explore the possibility that variation in gene expression in the 
tumour cell lines might provide a framework for interpreting the 
expression patterns in tumour specimens, we compared RNA 
isolated from two breast cancer biopsy samples, a sample of nor- 
mal breast tissue and the NCI60 cell lines derived from breast 
cancers (excluding MDA-MB-435 and MDA-N) and leukaemias 
(Fig. 4). This clustering highlighted features of the gene expres- 
sion pattern shared between the cancer specimens and individual 
cell lines derived from breast cancers and leukaemias. 

The genes encoding keratin 8 (KRT8) and keratin 19 (KRT19), 
as well as most of the other 'epithelial' genes defined in the com- 
plete NCI60 cell line cluster, were expressed in both of the biopsy 
samples and the two breast-derived cell lines, MCF-7 and T47D, 
expressing the oestrogen receptor, suggesting that these tran- 
scripts originated in tumour cells with features similar to those of 
luminal epithelial cells (Fig. 5a). Expression of a set of genes
characteristic of stromal cells, including collagen genes (COL3A1,
COL5A1 and COL6A1) and smooth muscle cell markers 
(TAGLN), was a feature shared by the tumour sample and the 
stromal-like cell lines Hs578T and BT549 (Fig. 5b). This feature 
of the expression pattern seen in the tumour samples is likely to 
be due to the stromal component of the tumour. The tumours 
also shared expression of a set of genes (Fig. 5c) with the multiple 
myeloma cell line (RPMI-8226), notably including 
immunoglobulin genes, consistent with the presence of B cells 
in the tumour (this was confirmed by staining with
anti-immunoglobulin antibodies; data not shown). Therefore, distinct
sets of genes with co-varying expression among the samples
(Fig. 4, arrow) appear to represent distinct cell types that can be
distinguished in breast cancer tissue. A fourth cluster of genes,
more highly expressed in all of the cell lines than in any of the
clinical specimens, was enriched for genes present in the
'proliferation' cluster described above (Fig. 5d). The variation in
expression of these genes likely paralleled the difference in
proliferation rate between the rapidly cycling cultured cell lines and the
much more slowly dividing cells in tissues.

Fig. 3 Gene clusters related to tissue characteristics in the cell lines. Enlargements of the regions of the cluster diagram in Fig. 1 showing gene clusters enriched
for genes expressed in cell lines of ostensibly similar origins. a, Cluster of genes highly expressed in the leukaemia-derived cell lines. Two sub-clusters distinguish
genes that were expressed in most leukaemia-derived lines from those expressed exclusively in the erythroblastoid line, K562 (note that the triplicate
hybridizations cluster together). b, Cluster of genes highly expressed in all colon (7/7) cell lines and all breast-derived cell lines positive for the oestrogen receptor (2/2). This
set of genes was also moderately expressed in most ovarian lines (5/6) and some non-small-cell-lung (4/6) lines, but was expressed at a lower level in all
renal-cancer-derived lines. c, Cluster of genes highly expressed in most melanoma-derived lines (6/7) and two related lines ostensibly derived from breast cancer
(MDA-MB435 and MDA-N). d, Cluster of genes highly expressed in all glioblastoma (6/6) lines and most lines derived from renal-cell carcinoma (7/8), and more
moderately expressed in a subset of carcinoma-derived lines. In all panels, names are shown only for all known genes whose identities were independently
re-verified by sequencing. The number of sequence-validated ESTs within the cluster is indicated below the cluster in parentheses. The position of gene names in the
adjacent list only approximates their position in the cluster diagram as indicated by the lines connecting the colour chart with the gene list. Complete cluster
images with all gene names and accession numbers are available (http://genome-www.stanford.edu/nci60).

Discussion 

Newly available genomics tools allowed us to explore variation in 
gene expression on a genomic scale in 60 cell lines derived from 
diverse tumour tissues. We used a simple cluster analysis to iden- 
tify the prominent features in the gene expression patterns that 
appeared to reflect 'molecular signatures' of the tissue from
which the cells originated. The histological characteristics of the 
cell lines that dominated the clustering were pervasive enough 
that similar relationships were revealed when alternative subsets 
of genes were selected for analysis. Additional features of the 
expression pattern may be related to variation in physiological 
attributes such as proliferation rate and activity of interferon- 
response pathways. 

The properties of the tumour-derived cell lines in this study
have presumably all been shaped by selection for resistance to 
host defences and chemotherapeutics and for rapid proliferation 
in the tissue culture environment of synthetic growth media, fetal 
bovine serum and a polystyrene substratum. But the primary 
identifiable factor accounting for variation in gene expression 
patterns among these 60 cell lines was the identity of the tissue 
from which each cell line was ostensibly derived. For most of the 
cell lines we examined, neither physiological nor experimental 
adaptation for growth in culture was sufficient to overwrite the 
gene expression programs established during differentiation in 
vivo. Nevertheless, the prominence of mesenchymal features in 
the cell lines isolated from glioblastomas and carcinomas may 
reflect a selection for the relative ease of establishment of cell 
lines expressing stromal characteristics, perhaps combined with 
physiological adaptation to tissue culture conditions 38-40.



Fig. 4 Comparison of the gene expression patterns in clinical breast cancer
specimens and cultured breast cancer and leukaemia cell lines. a, Two-dimensional
hierarchical clustering applied to gene expression data for two breast
cancer specimens, a lymph node metastasis from one patient, normal breast
and the NCI60 breast and leukaemia-derived cell lines. The gene expression
data from tissue specimens was clustered along with expression data from a
subset of the NCI60 cell lines to explore whether features of expression patterns
observed in specific lines could be identified in the tissue samples. Labels
indicate gene clusters (shown in detail in Fig. 5) that may be related to specific
cellular components of the tumour specimens. b, Breast cancer specimen 16
stained with anti-keratin antibodies, showing the complex mix of cell types
characteristically found in breast tumours. The arrows highlight the different
cellular components of this tissue specimen that were distinguished by the
gene expression cluster analysis (Fig. 5).



Biological themes linking genes with related expression pat- 
terns may be inferred in many cases from the shared attributes of 
known genes within the clusters. Uncharacterized cDNAs are 
likely to encode proteins that have roles similar to those of the 
known gene products with which they appear to be co-regulated. 
Still, for several clusters of genes, we were unable to discern a com- 
mon theme linking the identified members of the cluster. Further 
exploration of their variation in expression under more diverse 
conditions and more comprehensive investigation of the physiol- 
ogy of the NCI60 cells may provide insight 10 . The relationship of 
the gene expression patterns to the drug sensitivity patterns mea- 
sured by the DTP is an example of linking variation in gene 
expression with more subtle and diverse phenotypic variation 11.

The patterns of gene expression measured in the NCI 60 cell 
lines provide a framework that helps to distinguish the cells that 
express specific sets of genes in the histologically complex breast 
cancer specimens 41 . Although it is now feasible to analyse gene 
expression in micro-dissected tumour specimens 42,43 , this obser- 
vation suggests that it will be possible to explore and interpret 
some of the biology of clinical tumour samples by sampling them 
intact. As is useful in conventional morphological pathology, one 
might be able to observe interactions between a tumour and its 
microenvironment in this way. These relationships will be clari- 
fied by suitable analysis of gene expression patterns from intact as 
well as dissected tumours 12,14,15,41.

Methods 

cDNA clones. We obtained the 9,703 human cDNA clones (Research Genetics) 
used in these experiments as bacterial colonies in 96-well microtitre 
plates 9 . Approximately 8,000 distinct Unigene clusters (representing 
nominally unique genes) were represented in this set of clones. All genes 
identified here by name represent clones whose identities were confirmed by 
re-sequencing, or by the criterion that two or more independent cDNA clones 
ostensibly representing the same gene had nearly identical gene expression 
patterns. A single-pass 3' sequence re-verification was attempted for every 
clone after re-streaking for single colonies. For a subset of genes for which 
quality 3' sequence was not obtained, we attempted to confirm identities by 
5' sequencing. Of the subset of clones selected for 5' sequence verification 
on the basis of an interesting pattern of expression (888 total), 331 were 
correctly identified, 57 incorrectly identified and 500 indeterminate (poor 
quality sequence). We estimated that 15%-20% of array elements contained 
DNA representing more than one clone per well. So far, the identities of 
~3,000 clones have been verified. The full list of clones used and their 
nominal identities are available (gene names preceded by the designation 
"SID#" (Stanford Identification) represent clones whose identities have not 
yet been verified; http://genome-www.stanford.edu:8000/nci60). 

Production of cDNA microarrays. The arrays used in this experiment were 
produced at Synteni Inc. (now Incyte Pharmaceuticals). Each insert was 
amplified from a bacterial colony by sampling 1 µl of bacterial media and 
performing PCR amplification of the insert using consensus primers for 
the three plasmids represented in the clone set (5'-TTGTAAAACGACGGCCAGTG-3', 
5'-CACACAGGAAACAGCTATG-3'). Each PCR product (100 µl) was purified by gel 
exclusion, concentrated and resuspended in 3×SSC (10 µl). The PCR products 
were then printed on treated glass microscope slides using a robot with four 
printing tips. Detailed protocols for assembling and operating a microarray 
printer, and printing and experimental application of DNA microarrays are 
available (http://cmgm.stanford.edu/pbrown). 

Preparation of mRNA and reference pool. Cell lines were grown from NCI 
DTP frozen stocks in RPMI 1640 supplemented with phenol red, glutamine 
(2 mM) and 5% fetal calf serum. To minimize the contribution of variations 
in culture conditions or cell density to differential gene expression, we grew 
each cell line to 80% confluence and isolated mRNA 24 h after transfer to 
fresh medium. The time between removal from the incubator and lysis of the 
cells in RNA stabilization buffer was minimized (<1 min). Cells were lysed in 
buffer containing guanidinium isothiocyanate and total RNA was purified 
with the RNeasy purification kit (Qiagen). We purified mRNA as needed 
using a poly(A) purification kit (Oligotex, Qiagen) according to the 
manufacturer's instructions. We assessed the integrity of the mRNA and its 
relative contamination with ribosomal RNA by denaturing agarose gel 
electrophoresis. 

The breast tumours were surgically excised from patients and rapidly 
transported to the pathology laboratory, where samples for microarray 
analysis were quickly frozen in liquid nitrogen and stored at -80 °C until 
use. A frozen tumour specimen was removed from the freezer, cut into 
small pieces (~50-100 mg each), immediately placed into 10-12 ml of Trizol 
reagent (Gibco-BRL) and homogenized using a PowerGen 125 Tissue 
Homogenizer (Fisher Scientific), starting at 5,000 r.p.m. and gradually 
increasing to ~20,000 r.p.m. over a period of 30-60 s. We processed the 
Trizol/tumour homogenate as described in the Trizol protocol, including an 
initial step to remove fat. Once total RNA was obtained, we isolated mRNA 
with a FastTrack 2.0 kit (Invitrogen) using the manufacturer's protocol for 
isolating mRNA starting from total RNA. The normal breast samples were 
obtained from Clontech. 

Fig. 5 Histologic features of breast cancer biopsies can be recognized and parsed based on gene expression patterns. Enlargements of the regions of the cluster 
diagram in Fig. 4 showing gene clusters enriched for genes expressed in different cell types in the breast cancer specimens, as distinguished by clustering with the 
cultured cell lines. a, A cluster including many genes characteristic of epithelial cells expressed in cell lines (T47D and MCF7) derived from breast cancer positive for 
the oestrogen receptor and tumours. b, Genes expressed in cell lines derived from breast cancer with stromal cell characteristics (Hs578T and BT549) and tumour 
specimens. Expression of these genes in the tumour samples may reflect the presence of myofibroblasts in the cancer specimen stroma. c, Genes expressed in 
leukocyte-derived cell lines, showing common leukocyte, and separate 'myeloid' and 'B-cell', gene clusters. d, Genes that were relatively highly expressed in all cell lines 
compared with the tumour specimens and normal breast. The higher expression of this set of genes involved in cell cycle transit in the cell lines is likely to reflect the 
higher proliferative rate of cells cultured in the presence of serum compared with the average proliferation rate of cells in the biopsied tissue. 

We combined mRNA from the following cells in equal quantities to 
make the reference pool: HL-60 (acute myeloid leukaemia) and K562 
(chronic myeloid leukaemia); NCI-H226 (non-small-cell lung); COLO 
205 (colon); SNB-19 (central nervous system); LOX-IMVI (melanoma); 
OVCAR-3 and OVCAR-4 (ovarian); CAKI-1 (renal); PC-3 (prostate); and 
MCF7 and Hs578T (breast). The criteria for selection of the cell lines in 
the reference pool are described in detail in the accompanying manuscript 12 . 

Doubling-time calculations. We calculated doubling times based on routine 
NCI60 cell line compound screening data; they reflect the doubling 
times for cells inoculated into 96-well plates at the screening inoculation 
densities and grown in RPMI 1640 medium supplemented with 5% 
fetal bovine serum for 48 h. We measured cell populations using a 
sulforhodamine B optical density assay. The doubling time constant k 
was calculated using the equation $N/N_0 = e^{kt}$, where $N_0$ is the optical 
density for control (untreated) cells at time zero, $N$ is the optical density for 
control cells after 48-h incubation, and $t$ is 48 h. The same equation was then 
used with the derived $k$ to calculate the doubling time $t$ by setting 
$N/N_0 = 2$. For a given cell line, we obtained $N_0$ and $N$ values by averaging 
optical densities (N>6,000) obtained for each cell line for a year's screening. 
Data and experimental details are available (http://dtp.nci.nih.gov). 
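
The doubling-time calculation can be sketched as follows. This is a minimal illustration only, assuming the exponential growth form N/No = exp(kt) for the equation referred to in the text; the function names and the optical density values are hypothetical and are not taken from the DTP screening data.

```python
import math


def growth_constant(od_zero, od_final, hours=48.0):
    """Solve N/No = exp(k*t) for k (assumed exponential growth model)."""
    if od_zero <= 0 or od_final <= 0 or hours <= 0:
        raise ValueError("optical densities and time must be positive")
    return math.log(od_final / od_zero) / hours


def doubling_time(od_zero, od_final, hours=48.0):
    """Time t at which N/No = 2 for the derived growth constant k."""
    k = growth_constant(od_zero, od_final, hours)
    if k <= 0:
        raise ValueError("no net growth, doubling time is undefined")
    return math.log(2.0) / k


# Hypothetical averaged optical densities: a four-fold increase over 48 h.
example_hours = doubling_time(0.25, 1.0)
```

With these illustrative values the population doubles twice in 48 h, so the function returns a doubling time of 24 h.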

Preparation and hybridization of fluorescent labelled cDNA. For each 
comparative array hybridization, labelled cDNA was synthesized by reverse 
transcription from test cell mRNA in the presence of Cy5-dUTP, and from 
the reference mRNA with Cy3-dUTP, using the Superscript II 
reverse-transcription kit (Gibco-BRL). For each reverse transcription 
reaction, mRNA (2 µg) was mixed with an anchored oligo-dT (d-20T-d(AGC)) 
primer (4 µg) in a total volume of 15 µl, heated to 70 °C for 10 min and 
cooled on ice. To this sample, we added an unlabelled nucleotide pool 
(0.6 µl; 25 mM each dATP, dCTP, dGTP, and 15 mM dTTP), either Cy3 or Cy5 
conjugated dUTP (3 µl; 1 mM; Amersham), 5× first-strand buffer (6 µl; 
250 mM Tris-HCl, pH 8.3, 375 mM KCl, 15 mM MgCl2), 0.1 M DTT (3 µl) and 
2 µl of Superscript II reverse transcriptase (200 U/µl). After a 2-h 
incubation at 42 °C, the RNA was degraded by adding 1 N NaOH (1.5 µl) and 
incubating at 70 °C for 10 min. The mixture was neutralized by adding 
1 N HCl (1.5 µl), and the volume brought to 500 µl with TE (10 mM Tris, 
1 mM EDTA). We added Cot1 human DNA (20 µg; Gibco-BRL), and purified the 
probe by centrifugation in a Centricon-30 micro-concentrator (Amicon). The 
two separate probes were combined, brought to a volume of 500 µl, and 
concentrated again to a volume of less than 7 µl. We added 10 µg/µl 
poly(A) RNA (1 µl; Sigma) and tRNA (10 µg/µl; Gibco-BRL), and adjusted the 
volume to 9.5 µl with distilled water. For final probe preparation, 
20×SSC (2.1 µl; 1.5 M NaCl, 150 mM NaCitrate, pH 8.0) and 10% SDS 
(0.35 µl) were added to a total final volume of 12 µl. The probes 
were denatured by heating for 2 min at 100 °C, incubated at 37 °C for 
20-30 min, and placed on the array under a 22 mm × 22 mm glass coverslip. 
We incubated slides overnight at 65 °C for 14-18 h in a custom slide 
chamber with humidity maintained by a small reservoir of 3×SSC. Arrays were 
washed by submersion and agitation for 2-5 min in 2×SSC with 0.1% SDS, 
followed by 1×SSC and then 0.1×SSC. The arrays were "spun dry" by 
centrifugation for 2 min in a slide-rack in a Beckman GS-6 tabletop 
centrifuge in Microplus carriers at 650 r.p.m. 
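
As a check on the final probe volumes quoted above, the dilution arithmetic can be written out. This is only a worked sketch using the nominal stock labels (20×SSC, 10% SDS) and the stated volumes; the helper name is hypothetical.

```python
def final_concentration(stock_concentration, stock_volume, total_volume):
    """Concentration after diluting stock_volume of stock into total_volume."""
    return stock_concentration * stock_volume / total_volume


# Volumes in microlitres, as stated in the probe preparation step.
probe_volume = 9.5
ssc_volume = 2.1
sds_volume = 0.35

# 9.5 + 2.1 + 0.35 = 11.95, which the text rounds to 12 microlitres.
total_volume = probe_volume + ssc_volume + sds_volume

ssc_final = final_concentration(20.0, ssc_volume, total_volume)  # in x SSC
sds_final = final_concentration(10.0, sds_volume, total_volume)  # in % SDS
```

On these figures the hybridization mix comes out at roughly 3.5×SSC and 0.3% SDS.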

Array quantitation and data processing. Following hybridization, arrays 
were scanned using a laser-scanning microscope (ref. 17; 
http://cmgm.stanford.edu/pbrown). Separate images were acquired for Cy3 and 
Cy5. We carried out data reduction with the program ScanAlyze (M.B.E., 
available at http://rana.stanford.edu/software). Each spot was defined by 
manual positioning of a grid of circles over the array image. For each 
fluorescent image, the average pixel intensity within each circle was 
determined, and a local background was computed for each spot equal to the 
median pixel intensity in a square of 40 pixels in width and height centred 
on the spot centre, excluding all pixels within any defined spots. Net signal 
was determined by subtraction of this local background from the average 
intensity for each spot. Spots deemed unsuitable for accurate quantitation 
because of array artefacts were manually flagged and excluded from further 
analysis. Data files generated by ScanAlyze were entered into a custom 
database that maintains web-accessible files. Signal intensities between the 
two fluorescent images were normalized by applying a uniform scale factor to 
all intensities measured for the Cy5 channel. The normalization factor was 
chosen so that the mean log(Cy3/Cy5) for a subset of spots that achieved a 
minimum quality parameter (approximately 6,000 spots) was 0. This 
effectively defined the signal-intensity-weighted 'average' spot on each 
array to have a Cy3/Cy5 ratio of 1.0. 
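
The background subtraction and channel normalization described above can be sketched as follows. This is not the ScanAlyze implementation; it is a minimal illustration in which all function names are hypothetical, every spot is assumed to share one circle radius smaller than half the box width, and a simple positivity check stands in for the unspecified minimum quality parameter.

```python
import math
from statistics import mean, median


def in_circle(row, col, centre, radius):
    """True if pixel (row, col) lies inside the circle around `centre`."""
    return (row - centre[0]) ** 2 + (col - centre[1]) ** 2 <= radius ** 2


def net_signal(image, spot, all_spots, radius, box=40):
    """Mean intensity inside the spot circle minus the local background.

    The local background is the median of the pixels in a box x box square
    around the spot centre, skipping pixels inside any defined spot.
    """
    half = box // 2
    inside, background = [], []
    for row in range(max(0, spot[0] - half), min(len(image), spot[0] + half)):
        for col in range(max(0, spot[1] - half),
                         min(len(image[0]), spot[1] + half)):
            if in_circle(row, col, spot, radius):
                inside.append(image[row][col])
            elif not any(in_circle(row, col, s, radius) for s in all_spots):
                background.append(image[row][col])
    return mean(inside) - median(background)


def cy5_scale_factor(cy3, cy5):
    """Uniform factor f such that the mean of log(Cy3 / (f * Cy5)) is 0."""
    # Assumption: keeping only positive net signals stands in for the
    # quality filter, which the text does not specify.
    logs = [math.log(a / b) for a, b in zip(cy3, cy5) if a > 0 and b > 0]
    return math.exp(mean(logs))


# Hypothetical net signals for three spots on one array.
cy3 = [100.0, 200.0, 400.0]
cy5 = [50.0, 100.0, 200.0]
factor = cy5_scale_factor(cy3, cy5)
cy5_normalized = [factor * value for value in cy5]
```

Because the factor is the exponential of the mean log ratio, multiplying every Cy5 intensity by it shifts the mean log(Cy3/Cy5) of the selected spots to 0.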

Cluster analysis. We extracted tables (rows of genes, columns of individual 
microarray hybridizations) of normalized fluorescence ratios from the data- 
base. Various selection criteria, discussed in relation to each data set, were 
applied to select subsets of genes from the 9,703 cDNA elements on the 
arrays. Before clustering and display, the logarithms of the measured 
fluorescence ratios for each gene were centred by subtracting the arithmetic 
mean of all log ratios measured for that gene. The centring makes all subsequent analyses 
independent of the amount of each gene's mRNA in the reference pool. 
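
The centring step, and why it removes the dependence on the reference pool, can be illustrated with a short sketch. The function name and the ratio values are hypothetical, and base-2 logarithms are an assumption (the text does not state the base).

```python
import math


def centre_log_ratios(ratios):
    """Log-transform one gene's ratios and subtract that gene's mean log ratio."""
    logs = [math.log2(r) for r in ratios]
    gene_mean = sum(logs) / len(logs)
    return [value - gene_mean for value in logs]


# Hypothetical ratios for one gene across three hybridizations.
measured = [1.0, 2.0, 4.0]

# The same gene if the reference pool held one third as much of its mRNA:
# every ratio is multiplied by the same constant.
rescaled = [3.0 * r for r in measured]

centred_measured = centre_log_ratios(measured)
centred_rescaled = centre_log_ratios(rescaled)
```

A constant multiplier on all of a gene's ratios becomes a constant offset after taking logarithms, and subtracting the gene's mean removes that offset, so both inputs give the same centred values.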

We applied a hierarchical clustering algorithm separately to the cell lines 
and genes using the Pearson correlation coefficient as the measure of similarity 
and average linkage clustering 3,19-21 . The results of this process are 
two dendrograms (trees), one for the cell lines and one for the genes, in 
which very similar elements are connected by short branches, and longer 
branches join elements with diminishing degrees of similarity. For visual 
display the rows and columns in the initial data table were reordered to 
conform to the structures of the dendrograms obtained from the cluster 
analysis. Each cell in the cluster-ordered data table was replaced by a graded 
colour (pure red through black to pure green), representing the mean- 
adjusted ratio value in the cell. Gene labels in cluster diagrams are dis- 
played here only for genes that were represented in the microarray by 
sequence-verified cDNAs. A complete software implementation of this 
process is available (http://rana.stanford.edu/software), as well as all 
clustering results (http://genome-www.stanford.edu/nci60). 
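
The clustering procedure can be sketched in a few lines. This is a minimal illustration, not the software referred to above: it assumes complete data with no missing values and non-constant profiles, scores two clusters by the mean of all pairwise Pearson correlations between their members (one common reading of "average linkage", which may differ in detail from the authors' implementation), and uses hypothetical function names and data.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def average_linkage(profiles):
    """Agglomerative clustering; returns merges as (members_a, members_b, similarity)."""
    n = len(profiles)
    sim = {(i, j): pearson(profiles[i], profiles[j])
           for i in range(n) for j in range(i + 1, n)}

    def cluster_similarity(a, b):
        # Mean pairwise correlation between the members of two clusters.
        pairs = [sim[(min(i, j), max(i, j))] for i in a for j in b]
        return sum(pairs) / len(pairs)

    clusters = [(i,) for i in range(n)]
    merges = []
    while len(clusters) > 1:
        # Join the two most similar clusters; early joins are the short branches.
        score, a, b = max(
            ((cluster_similarity(a, b), a, b)
             for index, a in enumerate(clusters)
             for b in clusters[index + 1:]),
            key=lambda item: item[0],
        )
        clusters = [c for c in clusters if c not in (a, b)] + [a + b]
        merges.append((a, b, score))
    return merges


# Hypothetical centred expression profiles for four genes.
profiles = [
    [1.0, 2.0, 3.0, 4.0],
    [2.0, 4.0, 6.0, 8.0],
    [4.0, 3.0, 2.0, 1.0],
    [1.0, 3.0, 2.0, 4.0],
]
merges = average_linkage(profiles)
```

The order of the merges defines the dendrogram. Applying the same function to the transposed table clusters the cell lines instead of the genes, and the rows and columns can then be reordered to follow the two trees.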